CHAPTER 1
AN INTRODUCTION TO FLOATING POINT 



1.1 WHAT IS A FLOATING POINT NUMBER? 

The numbers we encounter every day, such as 12, 34.56, 0.0789, 
etc., are known as fixed point numbers because the decimal point 
is in a fixed position. Such numbers are fairly closely matched in 
magnitude and within about ten orders of magnitude from unity. 
Examples of such numbers are found in bank accounts, unit
prices of store items and paychecks. 

In scientific applications, the numbers encountered can be very 
large. Avogadro's number expressed in fixed point notation is 
approximately 602,250,000,000,000,000,000,000. A scientist 
may also use Planck's constant which would be approximately 
0.000000000000000000000000006626196 erg sec in fixed point 
notation. These examples demonstrate the undesirability of writ- 
ing fixed point notation and why most scientists use the concise 
floating point notation to represent numbers such as Avogadro's 
number and Planck's constant. 

When a scientist writes the value of Avogadro's number, he writes
6.0225 x 10^23. Similarly he would express Planck's constant as
6.626196 x 10^-27 erg sec.

As we can observe, the number +6.0225 x 10^23 consists of 4
parts: 

Sign - 

The sign of the number (+ or -). The plus sign is usually 
assumed when no sign is shown. 
Mantissa - 

Sometimes also known as the fraction. The mantissa describes 
the actual number. In the example, the mantissa is 6.0225. 
Exponent - 

Sometimes also known as the characteristic. The exponent 
describes the order of magnitude of the number. In the exam- 
ple, the exponent is 23. 
Base - 

Sometimes also known as the radix. The base is the number
that is raised to the power of the exponent. In the example,
the base is 10.



The parts of a floating point number can then be represented by 
the following equation: 

F = (-1)^S x M x B^E

where 

F = floating point number 

S = sign of the floating point number, so that S = 0 if the
number is positive and S = 1 if the number is negative
M = mantissa of the floating point number 
B = base of the floating point number 
E = exponent of the floating point number 
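The equation can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function name compose and the variable names are chosen here for illustration only.

```python
def compose(sign, mantissa, base, exponent):
    """Evaluate F = (-1)^S x M x B^E for the four parts of a number."""
    return (-1) ** sign * mantissa * base ** exponent

# Avogadro's number and Planck's constant from Section 1.1.
avogadro = compose(0, 6.0225, 10, 23)
planck = compose(0, 6.626196, 10, -27)

# A negative binary example: -1.5 x 2^3.
minus_twelve = compose(1, 1.5, 2, 3)
```

Note that the same value can be written with many mantissa/exponent pairs; the normalized form discussed in Section 2.4 picks one of them.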

1.2 WHEN SHOULD FLOATING POINT BE USED? 

Although floating point numbers are useful when numbers of very 
different magnitude are used, they should not be used indiscrim- 
inately. There is an inherent loss of accuracy and increased 
execution time for floating point computations on most compu- 
ters. Floating point computation suffers the greatest loss of ac- 
curacy when two numbers of closely matched magnitude are 
subtracted from each other or two numbers of opposite sign but 
almost equal magnitude are added together. Therefore, the As- 
sociative Law in arithmetic 

A + (B + C) = (A + B) + C

does not always hold true if B is of opposite sign to A and C and 
very similar in magnitude to either A or C. 
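The failure of the Associative Law is easy to reproduce. The sketch below uses Python, whose floating point numbers are 64-bit binary values on common platforms (an assumption about the host machine, not something stated in this text). B is chosen with opposite sign to A and C and the same magnitude as A.

```python
a = 1.0e20
b = -1.0e20
c = 1.0

# B + C is rounded back to -1.0e20 because 1.0 is far below the
# precision available at that magnitude, so C is lost entirely.
left = a + (b + c)

# A + B cancels exactly, so C survives.
right = (a + b) + c
```

The two groupings give 0.0 and 1.0 respectively, although both are mathematically equal to 1.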

In most computers, hardware floating point multiply and divide 
takes approximately the same amount of execution time as 
hardware fixed point multiply and divide, but hardware floating 
point add and subtract usually takes considerably more time than
hardware fixed point add and subtract. If the computer lacks 
floating point hardware, all floating point computations will con- 
sume more CPU time than fixed point computations. 



CHAPTER 2 
FLOATING POINT FORMATS 



2.1 COMMONLY USED FLOATING POINT BASES 

The following three number bases are commonly used in floating 
point number systems: 

1) Binary - The base is 2. 

2) Binary Coded Decimal (BCD) - The base is 10.

3) Hexadecimal - The base is 16. 



2.2 COMPARISONS OF THE THREE 
COMMONLY USED BASES 

Binary - 

The main advantages of the binary floating point format are 
relative ease of hardware implementation and maximum ac- 
curacy for a given number of bits. On the negative side, the 
conversion of an ASCII (American Standard Code for Informa- 
tion Interchange) decimal string to and from a binary floating 
number is difficult and time consuming. In commercial applica- 
tions where input and output are always decimal character 
strings, the binary floating point numbers will have an inherent 
rounding error because numbers such as 0.1 (decimal) cannot be
represented exactly with a binary floating point number. 
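The rounding error for decimal 0.1 can be made visible with exact rational arithmetic. This is a minimal sketch in Python, assuming the host uses binary floating point (true of common platforms); the names stored, exact and error are illustrative.

```python
from fractions import Fraction

# The value actually held when 0.1 is stored as a binary float.
stored = Fraction(0.1)

# The value that was intended.
exact = Fraction(1, 10)

# The stored denominator is a power of two, so it can never equal 10.
error = stored - exact
```

The error is tiny but non-zero, which is why commercial applications that must reproduce decimal results exactly often prefer BCD.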



BCD - 

The advantages and disadvantages of the BCD floating point 
numbers are just the opposite of the binary floating point num- 
bers. BCD floating point is most commonly used in commercial 
applications where the computations involved are usually sim- 
ple and input/output is always in the form of decimal ASCII 
strings. 



Hexadecimal - 

The hexadecimal floating point numbers have similar advan- 
tages and disadvantages as the binary floating point when 
compared with the BCD floating point format. When the same 
number of bits of exponent and mantissa are used, the 
hexadecimal floating point gives a considerably larger dynamic 
range than the binary floating point format. For example, for a 
7-bit exponent, the largest positive number that can be
represented in the hexadecimal floating point is approximately
16^64 (approximately 1.16 x 10^77). The smallest non-zero
positive number that can be represented is 16^-64 (approximately
8.64 x 10^-78). By comparison, the largest and smallest positive
numbers that can be represented in a 7-bit exponent binary
system are approximately 1.84 x 10^19 and 5.42 x 10^-20
respectively.
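The range figures quoted above follow directly from the powers involved. A short Python check, with variable names chosen here for illustration:

```python
# Extremes for a 7-bit exponent (range -64 to +64 as used in the text).
hex_max = 16.0 ** 64
hex_min = 16.0 ** -64
bin_max = 2.0 ** 64
bin_min = 2.0 ** -64

# Format to three significant digits for comparison with the text.
summary = ["%.2e" % v for v in (hex_max, hex_min, bin_max, bin_min)]
```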



An advantage of the hexadecimal floating point system over the
binary floating point system is that during normalization and
denormalization of the floating point numbers the hexadecimal system
requires far fewer shifts compared with the binary system, be- 
cause the hexadecimal system shifts four places at a time and 
most binary systems shift only one place at a time. For more 
sophisticated systems where normalization and denormalization 
can be done in one operation, this advantage does not exist. Most 
present-day systems do not fall in this category. 

The disadvantage of the hexadecimal system is the loss of
precision as compared with the binary system when the number of
mantissa bits is the same. Since the three most significant bits
could be zero when the first hexadecimal digit is a 1, this
leads to a loss of 3 bits of accuracy in the worst case. However,
assuming uniform distribution of the leading hexadecimal digit,
the average loss of accuracy is only 11/15 bit. The above
comparison assumes the binary system does not use an "implied 1"
(Section 2.4). The loss of accuracy in a hexadecimal system
compared with a binary system using an "implied 1" and the same
number of bits of mantissa is 4 bits in the worst case and
1 11/15 bits on the average.
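The 11/15 figure can be reproduced by counting the leading zero bits of each possible non-zero leading hexadecimal digit. The following Python sketch assumes, as the text does, that the fifteen digits 1 through F are equally likely; the helper name leading_zero_bits is illustrative.

```python
from fractions import Fraction

def leading_zero_bits(digit):
    """Zero bits ahead of the first 1 in a 4-bit hexadecimal digit."""
    return 4 - digit.bit_length()

# A normalized hexadecimal mantissa starts with a digit from 1 to 15.
losses = [leading_zero_bits(d) for d in range(1, 16)]

worst = max(losses)                    # no implied 1 in the binary format
average = Fraction(sum(losses), 15)

# Against a binary format with an implied 1, one more bit is lost.
worst_implied = worst + 1
average_implied = average + 1
```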

2.3 DIFFERENT EXPONENT FORMATS 

Two types of exponents used in floating point number systems 
are the biased exponent and the unbiased exponent. An unbiased
exponent is a two's complement number. An exponent said to be
biased by N (or in excess N notation) means that the coded
exponent is formed by adding N to the actual exponent in two's
complement form. Any overflow generated from the addition is
ignored. The result becomes an unsigned number. Most
common floating point systems use a biased exponent. Biased 
exponents are used to simplify floating point hardware. During 
floating point computations, arithmetic operations such as add 
and subtract need to be performed on the exponents of the 
operands. If a biased exponent is used, the arithmetic logic unit 
(ALU) needs only to perform unsigned arithmetic. If an unbiased 
exponent is used, the ALU must perform two's complement 
arithmetic, and overflow conditions are more difficult to detect. 
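A biased exponent can be illustrated with a few lines of Python. The sketch assumes an 8-bit exponent field with a bias of 127, the values used later for the Am9512 single precision format; the function names are illustrative.

```python
BIAS = 127  # excess-127 notation, 8-bit exponent field (assumed)

def encode_exponent(actual):
    """Add the bias and discard any overflow beyond 8 bits."""
    return (actual + BIAS) & 0xFF

def decode_exponent(coded):
    """Recover the actual exponent from the unsigned coded value."""
    return coded - BIAS

# Coded exponents compare correctly as plain unsigned numbers.
actuals = list(range(-127, 129))
coded = [encode_exponent(e) for e in actuals]
```

Because the coded values rise monotonically with the actual exponent, an unsigned comparison of two coded exponents gives the same answer as a signed comparison of the actual ones.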

2.4 "IMPLIED 1" 

Most floating point numbers must always be presented to the 
computer in "normalized" form (i.e., the most significant digit of 
the mantissa is always non-zero, except if the number is zero). 
For a binary floating point system, this would mean the leading 
binary bit of the mantissa is always 1 (except when the number is 
zero). In some floating point number systems, such as the Am9512
format, this 1 bit is not represented on input or output to the 
floating point processor. The extra bit can be used for one more 
bit of precision or one more bit of exponent range. 
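The "implied 1" can be sketched as follows, assuming a 24-bit normalized binary mantissa of which only 23 bits are stored. The function names hide_leading_one and restore_leading_one are chosen here for illustration and do not come from any Am9512 documentation.

```python
def hide_leading_one(mantissa24):
    """Drop the always-set top bit of a normalized 24-bit mantissa."""
    if mantissa24 >> 23 != 1:
        raise ValueError("mantissa is not normalized")
    return mantissa24 & 0x7FFFFF

def restore_leading_one(stored23):
    """Put the implied 1 back before doing arithmetic."""
    return stored23 | 0x800000

stored = hide_leading_one(0xC00000)      # binary 1.1 (decimal 1.5)
restored = restore_leading_one(stored)
```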



CHAPTER 3 
FLOATING POINT ARITHMETIC 



3.1 INTRODUCTION 

This chapter describes the basic principles of performing arith- 
metic with floating point numbers. First, the internal mechanism of 
floating point is analyzed. The following discussion uses the 
Am9512 single precision format although the discussion can 
apply to other formats with only minor modifications. The 
operands are assumed to be located in a stack. The first operand 
is called TOS (top of stack) and the second operand is called NOS 
(next on stack). 

3.2 FLOATING POINT ADD AND SUBTRACT 

Floating point add and subtract use essentially the same al- 
gorithm. The only difference is that floating point subtract 
changes the sign of the floating point number at top of stack and 
then performs the floating point add. 



The following is a step-by-step description of a floating point add 
algorithm (Figure 3.1): 

a. Unpack TOS and NOS. 

b. The exponent of TOS is compared to the exponent of NOS. 

c. If the exponents are equal, go to step f. 

d. Right-shift the mantissa of the number with the smaller expo- 
nent. 

e. Increment the smaller exponent and go to step b. 

f. Set sign of result to sign of larger number. 

g. Set exponent of result to exponent of larger number.

h. If signs of the two numbers are not equal, go to m.

i. Add mantissas.

j. Right-shift resultant mantissa by 1 and increment exponent of
result by 1.

k. If the most significant bit (MSB) of exponent changes from 1 to
0 as a result of the increment, set overflow status.

l. Round if necessary and exit.

m. Subtract smaller mantissa from larger mantissa.

n. Left-shift mantissa and decrement exponent of result.

o. If MSB of exponent changes from 0 to 1 as a result of the
decrement, set underflow status and exit.

p. If the MSB of the resultant mantissa = 0, go to n.

q. Round if necessary and exit.

Figure 3.1. Floating Point Add/Subtract Flowchart
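The add algorithm can be sketched in Python on unpacked operands. This is a simplified illustration under stated assumptions, not the Am9512 implementation: each operand is a tuple (sign, exponent, mantissa) with an unbiased exponent and a normalized 24-bit integer mantissa, the right shift of step j is performed only when the addition carries out of bit 23, and rounding and overflow/underflow status are omitted. The names fp_add and to_float are illustrative.

```python
def fp_add(tos, nos):
    """Add two non-zero unpacked numbers (sign, exponent, mantissa).

    The value represented is (-1)**sign * mantissa * 2**(exponent - 23).
    """
    (s1, e1, m1), (s2, e2, m2) = tos, nos
    # Steps b-e: right-shift the mantissa with the smaller exponent.
    while e1 < e2:
        m1 >>= 1
        e1 += 1
    while e2 < e1:
        m2 >>= 1
        e2 += 1
    exp = e1
    if s1 == s2:
        # Steps i-j: add, then renormalize if a carry was produced.
        sign, mant = s1, m1 + m2
        if mant >> 24:
            mant >>= 1
            exp += 1
    else:
        # Step m: subtract the smaller mantissa from the larger one.
        sign, mant = (s1, m1 - m2) if m1 >= m2 else (s2, m2 - m1)
        if mant == 0:
            return (0, 0, 0)
        # Steps n-p: left-shift until the MSB of the mantissa is 1.
        while not mant & 0x800000:
            mant <<= 1
            exp -= 1
    return (sign, exp, mant)

def to_float(number):
    """Convert an unpacked number to a Python float for checking."""
    sign, exp, mant = number
    return (-1) ** sign * mant * 2.0 ** (exp - 23)

total = fp_add((0, 0, 0xC00000), (0, 1, 0xA00000))        # 1.5 + 2.5
difference = fp_add((0, 1, 0xA00000), (1, 0, 0xC00000))   # 2.5 - 1.5
```

Subtraction is obtained, as the text describes, by inverting the sign of the operand at top of stack before calling the add routine.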

3.3 FLOATING POINT MULTIPLY 

Floating point multiply basically involves the addition of the expo- 
nents and multiplication of the mantissas. The following is a 
step-by-step description of a floating point multiplication al- 
gorithm (Figure 3.2): 

a. Check if TOS or NOS = 0.

b. If either TOS or NOS = 0, set result to 0 and exit.

c. Unpack TOS and NOS.

d. Convert EXP (TOS) and EXP (NOS) to unbiased form:

   EXP (TOS) = EXP (TOS) - 127 (decimal)
   EXP (NOS) = EXP (NOS) - 127 (decimal)

e. Add exponents:

   EXP = EXP (TOS) + EXP (NOS)

f. If MSB of EXP (TOS) = MSB of EXP (NOS) = 0 and MSB of
EXP = 1, then set overflow status and exit.

g. If MSB of EXP (TOS) = MSB of EXP (NOS) = 1 and MSB of
EXP = 0, then set underflow status and exit.

h. Convert exponent back to biased form:

   EXP = EXP + 127 (decimal)

i. If sign of TOS = sign of NOS, set sign of result to 0; otherwise,
set sign of result to 1.

j. Multiply mantissas.

k. If MSB of resultant mantissa = 1, right-shift mantissa by 1 and
increment exponent of resultant.

l. If MSB of exponent changes from 1 to 0 as a result of the
increment, set overflow status.

m. Round if necessary and exit.
Figure 3.2. Floating Point Multiply Flowchart
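The multiply algorithm can be sketched in Python as follows. The sketch assumes unpacked operands of the form (sign, biased exponent, 24-bit normalized mantissa) with a bias of 127, truncates instead of rounding, and reports exponent overflow or underflow by raising an exception instead of setting a status bit. The name fp_mul and the tuple layout are illustrative.

```python
BIAS = 127

def fp_mul(tos, nos):
    """Multiply two unpacked numbers (sign, biased exponent, mantissa).

    The value represented is
    (-1)**sign * mantissa * 2**(exponent - BIAS - 23).
    """
    (s1, e1, m1), (s2, e2, m2) = tos, nos
    if m1 == 0 or m2 == 0:
        return (0, 0, 0)
    # Remove the bias from both exponents, then add them.
    exp = (e1 - BIAS) + (e2 - BIAS)
    # Equal signs give a positive result, unequal signs a negative one.
    sign = s1 ^ s2
    # The product of two 24-bit mantissas has 47 or 48 bits.
    product = m1 * m2
    if product >> 47:
        # Top bit set: right-shift once and increment the exponent.
        product >>= 1
        exp += 1
    mant = product >> 23          # keep 24 bits, discarding the rest
    exp += BIAS                   # back to biased form
    if exp > 255:
        raise OverflowError("exponent overflow")
    if exp < 0:
        raise ArithmeticError("exponent underflow")
    return (sign, exp, mant)

product_a = fp_mul((0, 127, 0xC00000), (1, 128, 0xA00000))   # 1.5 x -2.5
product_b = fp_mul((0, 127, 0xC00000), (0, 127, 0xC00000))   # 1.5 x 1.5
```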
3.4 FLOATING POINT DIVIDE

The floating point divide basically involves the subtraction of 
exponents and the division of mantissas. The following is a step- 
by-step description of a division algorithm (Figure 3.3): 

a. If TOS = 0, set divide exception error and exit.

b. If NOS = 0, set result to 0 and exit.

c. Unpack TOS and NOS.

d. Convert EXP (TOS) and EXP (NOS) to unbiased form:

   EXP (TOS) = EXP (TOS) - 127 (decimal)
   EXP (NOS) = EXP (NOS) - 127 (decimal)

e. Subtract exponent of TOS from exponent of NOS:

   EXP = EXP (NOS) - EXP (TOS)

f. If MSB of EXP (NOS) = 0, MSB of EXP (TOS) = 1, and MSB
of EXP = 1, then set overflow status and exit.

g. If MSB of EXP (NOS) = 1, MSB of EXP (TOS) = 0, and MSB
of EXP = 0, then set underflow status and exit.

h. Add bias to exponent of result:

   EXP = EXP + 127 (decimal)

i. If sign of TOS = sign of NOS, set sign of result to 0, else set
sign of result to 1.

j. Divide mantissa of NOS by mantissa of TOS.

k. If MSB = 0, left-shift mantissa and decrement exponent of
resultant, or else go to n.

l. If MSB of exponent changes from 0 to 1 as a result of the
decrement, set underflow status.

m. Go to k.

n. Round if necessary and exit.
Figure 3.3. Floating Point Divide Flowchart
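The divide algorithm can be sketched in Python in the same style. The assumptions are the same as for the other sketches in this chapter: operands are (sign, biased exponent, 24-bit normalized mantissa) tuples with a bias of 127, the result is truncated, and exceptions stand in for the status bits. One extra quotient bit is computed so that the left shift of step k does not lose precision; that detail is a choice made for this sketch. The name fp_div is illustrative.

```python
BIAS = 127

def fp_div(nos, tos):
    """Compute NOS / TOS for unpacked (sign, biased exponent, mantissa)."""
    (s1, e1, m1), (s2, e2, m2) = nos, tos
    if m2 == 0:
        raise ZeroDivisionError("divide exception")
    if m1 == 0:
        return (0, 0, 0)
    # Remove the bias, then subtract exponent of TOS from that of NOS.
    exp = (e1 - BIAS) - (e2 - BIAS)
    sign = s1 ^ s2
    # The mantissa ratio lies between 0.5 and 2, so this quotient
    # has either 24 or 25 significant bits.
    quotient = (m1 << 24) // m2
    if quotient >> 24:
        # Ratio is 1 or more: drop the extra bit, exponent unchanged.
        mant = quotient >> 1
    else:
        # Ratio is below 1: equivalent to one left shift of the
        # 24-bit quotient with a decrement of the exponent.
        mant = quotient
        exp -= 1
    exp += BIAS
    if exp > 255:
        raise OverflowError("exponent overflow")
    if exp < 0:
        raise ArithmeticError("exponent underflow")
    return (sign, exp, mant)

quotient_a = fp_div((0, 128, 0xF00000), (0, 128, 0xA00000))   # 3.75 / 2.5
quotient_b = fp_div((0, 127, 0x800000), (0, 127, 0xC00000))   # 1.0 / 1.5
```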
CHAPTER 4
DATA CONVERSION 



4.1 INTRODUCTION 

This chapter describes how to convert fixed point binary integer to 
floating point, floating point to fixed point binary integer, decimal 
ASCII (American Standard Code for Information Interchange) 
string to floating point and floating point to decimal ASCII string. 
These conversion methods are useful because few real-world 
inputs and outputs are in floating point format. When human 
interface is involved, the real-world interface is usually a decimal 
ASCII string. If the data are collected through some automatic 
means such as an A/D converter, counters, etc., the input is 
usually in the form of fixed point binary or BCD integers. In this 
chapter, the floating point format is assumed to be the Am9512 
single precision format. 

4.2 BINARY FIXED POINT TO FLOATING POINT 

The input to this routine is assumed to be a 32-bit two's
complement number and the output is a binary floating point number
of Am9512 format. Figure 4.1 shows the flowchart of such a
program and Figure 4.2 shows an Am9080A assembly language
subroutine that accomplishes this task.

The data format used in the assembly language conversion is as 
follows: 

Fixed Point - 

Two's complement number that occupies 4 consecutive mem- 
ory locations with the most significant byte residing in low 
memory. To address the number, the pointer points to the low 
address. 

Floating Point - 

Am9512 floating point format that occupies 4 consecutive 
memory locations. The sign and 7 bits of the exponent reside
in the low address. To address the number, the pointer points to 
the low address. 



[Flowchart: START; FLOAT = FIX, EXP = 150 (decimal), SIGN = 0;
test FLOAT = 0; if negative, SIGN = 1 and FLOAT = -FLOAT; right
shift FLOAT with EXP = EXP + 1, or left shift FLOAT with
EXP = EXP - 1, until normalized; bits 23-30 = EXP, bit 31 = SIGN;
RETURN]

Figure 4.1. Fix to Float Conversion Flowchart
$       PAGEWIDTH(80) MACROFILE
;
;       SUBROUTINES TO CONVERT FIX TO FLOAT
;       AND FLOAT TO FIX POINT FORMATS
;
        NAME    CONVT
        PUBLIC  FXTOFL,FLTOFX
        EXTRN   QMOVE,QTEST,QNEG,QLSL,QLSR,QCLR
        CSEG    PAGE
;
;       FIX TO FLOAT CONVERSION ROUTINE
;       TO CALL THE PROGRAM,
;       HL = POINTER TO THE FIXED POINT NUMBER
;       DE = POINTER TO THE FLOATING POINT NUMBER
;       ACC AND PSW ARE ALTERED BY THE SUBROUTINE
;       ALL OTHER REGISTERS ARE NOT DISTURBED
;
FXTOFL: PUSH    B               ;SAVE BC REGISTER PAIR
        PUSH    D               ;SAVE DESTINATION POINTER
        PUSH    H               ;SAVE SOURCE POINTER
        CALL    QMOVE           ;COPY FIXED PT NO. INTO FLOAT
        XCHG                    ;PUT FLOAT POINTER IN HL
        CALL    QTEST           ;TEST IF NO. = 0?
        JZ      RETN            ;YES - JUMP
;
;       THE NUMBER IS NOT ZERO, INIT. SIGN AND EXP BIAS
;
        MVI     B,0             ;B REG = SIGN
        MVI     C,23+127        ;C REG = EXPONENT
;
;       TEST IF THE NUMBER IS NEGATIVE
;
        MOV     A,M             ;GET MSB FROM FLOAT
        ORA     A               ;SET FLAGS
        JP      FX10            ;JUMP IF NO. IS POSITIVE
;
;       THE FIXED POINT NUMBER IS NEGATIVE
;       NEGATE NUMBER AND SET SIGN = 1
;
        MVI     B,80H           ;SET SIGN TO 80H
        CALL    QNEG            ;NEGATE NUMBER IN FLOAT
;
;       TEST IF MOST SIGNIFICANT BYTE OF FLOAT = 0
;
FX10:   MOV     A,M             ;GET MSB OF FLOAT
        ORA     A               ;SET FLAGS
        JZ      FX20            ;JUMP IF MSB = 0
;
;       MSB NOT ZERO, RIGHT SHIFT REQUIRED
;
FX15:   INR     C               ;INC. EXP BY 1
        CALL    QLSR            ;LOGICAL SHIFT RIGHT OF FLOAT
        MOV     A,M             ;TEST IF MSB = 0
        ORA     A               ;SET FLAGS
        JNZ     FX15            ;NOT ZERO, SHIFT SOME MORE
;
;       MSB = 0, TEST IF LEFT SHIFT REQUIRED
;       (THE RIGHT SHIFT PATH FALLS THROUGH TO HERE SO THAT DE IS
;       SET UP FOR FX30; BIT 23 IS THEN ALREADY 1)
;
FX20:   MOV     D,H
        MOV     E,L             ;PUT FLOAT POINTER INTO DE
        INX     D               ;POINT TO NEXT MSB OF FLOAT
FX25:   LDAX    D               ;GET NEXT MSB
        ORA     A               ;SET FLAGS
        JM      FX30            ;DONE IF BIT 23 = 1
        DCR     C               ;DEC. EXP BY 1
        CALL    QLSL            ;LOGICAL LEFT SHIFT OF FLOAT
        JMP     FX25            ;TRY AGAIN
;
;       SHIFT COMPLETE, MANTISSA FORMED IN FLOAT
;
FX30:   LDAX    D               ;GET NEXT MSB OF FLOAT
        ANI     7FH             ;STRIP OFF HIDDEN "1"
        STAX    D               ;PUT IT BACK IN MEMORY
        MOV     A,C             ;GET EXPONENT
        RRC                     ;ROTATE RIGHT
        MOV     C,A             ;PUT ROTATED EXP. BACK IN C
        ANI     80H             ;EXTRACT LSB OF EXPONENT
        XCHG                    ;PUT NEXT MSB POINTER IN HL
        ORA     M               ;COMBINE MSB OF MANTISSA WITH EXP
        MOV     M,A
        XCHG                    ;RESTORE POINTERS
        MOV     A,C             ;GET ROTATED EXPONENT
        ANI     7FH             ;STRIP OFF LSB
        ORA     B               ;COMBINE EXP WITH SIGN
        MOV     M,A             ;SET MSB OF FLOAT
;
;       CONVERSION COMPLETE, RETURN TO CALLER
;
RETN:   POP     H               ;RESTORE ALL REGISTERS
        POP     D
        POP     B
        RET                     ;RETURN TO CALLER
;
;       FLOAT TO FIX CONVERSION ROUTINE
;       TO CALL THE PROGRAM
;       HL = POINTER TO THE FLOATING POINT NUMBER
;       DE = POINTER TO THE FIXED POINT NUMBER
;       ON RETURN
;       A REG = 0 AND Z FLAG = 1 IF NO ERROR
;       A = 1 AND Z FLAG = 0 IF OVERFLOW ERROR
;       OTHER REGISTERS ARE NOT DISTURBED
;
FLTOFX: PUSH    B               ;SAVE ALL REGISTERS
        PUSH    D
        PUSH    H
        CALL    QMOVE           ;COPY FLOAT TO FIX
        CALL    QTEST           ;TEST IF INPUT NO. = 0?
        JZ      FL40            ;RETURN IF INPUT IS ZERO
;
;       EXTRACT SIGN AND EXPONENT FROM FLOATING PT NO.
;
        XCHG                    ;HL POINTS TO FIX
        MOV     A,M             ;GET MSB
        ANI     80H             ;EXTRACT SIGN BIT
        MOV     B,A             ;SAVE SIGN IN B
        MOV     A,M             ;GET MSB AGAIN
        RLC                     ;MULTIPLY BY 2
        ANI     0FEH            ;STRIP OFF LSB
        MOV     C,A             ;SAVE IN C
        INX     H               ;POINT TO NEXT MSB
        MOV     A,M             ;GET NEXT MSB
        RLC                     ;MOVE LSB OF EXP INTO CARRY
        JNC     $+4             ;SKIP IF NO CARRY
        INR     C               ;PROPAGATE CARRY INTO EXP
        MOV     A,M             ;GET NEXT MSB
        ORI     80H             ;SET HIDDEN BIT
        MOV     M,A             ;RESTORE NEXT MSB
        DCX     H               ;NOW HL POINTS TO MSB AGAIN
        MVI     M,0             ;CLEAR MSB
        MOV     A,C             ;GET BIASED EXPONENT
        SUI     127             ;STRIP OFF BIAS
        JM      ZERO            ;EXP < 0, RETURN ZERO AS RESULT
        CPI     31              ;CHECK IF EXP > 31
        JNC     OVFL            ;JUMP IF NUMBER IS TOO LARGE
        SUI     23              ;SUBTRACT EXP BY 23
        JZ      FL30            ;NO SHIFT REQUIRED, CHECK SIGN
        MOV     C,A             ;SAVE SHIFT COUNT
        JC      FL20            ;COUNT < 0, RIGHT SHIFT
;
;       COUNT > 0, LEFT SHIFT REQUIRED
;
FL10:   CALL    QLSL            ;LOGICAL SHIFT LEFT
        DCR     C
        JNZ     FL10
        JMP     FL30
;
;       COUNT < 0, RIGHT SHIFT REQUIRED
;
FL20:   CALL    QLSR            ;LOGICAL SHIFT RIGHT
        INR     C
        JNZ     FL20
;
;       SHIFT COMPLETE, CHECK SIGN AND EXIT
;
FL30:   MOV     A,B             ;GET SIGN
        ORA     A               ;SET FLAGS
        JP      FL40            ;PLUS SIGN, SKIP NEGATION
        CALL    QNEG            ;MINUS SIGN, NEGATE NUMBER
;
;       CLEAR ERROR FLAG AND RETURN
;
FL40:   XRA     A
        POP     H               ;RESTORE ALL REGISTERS
        POP     D
        POP     B
        RET
;
;       ZERO FIX POINT NUMBER AND RETURN
;
ZERO:   CALL    QCLR            ;CLEAR FIX POINT NUMBER
        JMP     FL40            ;RETURN
;
;       SET OVERFLOW FLAG AND RETURN
;
OVFL:   MVI     A,1             ;SET A REG
        ORA     A               ;SET Z FLAG
        JMP     FL40+1          ;RESTORE REG. AND RETURN
        END


$       PAGEWIDTH(80) MACROFILE
;
;       QUADRUPLE PRECISION SUBROUTINES
;       *******************************
;
        PUBLIC  QMOVE,QTEST,QNEG,QLSL,QLSR,QCLR
        CSEG
;
;       MOVE 4 BYTES POINTED TO BY HL
;       TO 4 BYTES POINTED TO BY DE
;       M(DE) = M(HL)
;
QMOVE:  PUSH    B               ;SAVE ALL REGISTERS
        PUSH    D
        PUSH    H
        MVI     B,4
QM10:   MOV     A,M             ;GET BYTE FROM M(HL)
        STAX    D               ;STORE BYTE IN M(DE)
        INX     H               ;BUMP SOURCE POINTER
        INX     D               ;BUMP DESTINATION POINTER
        DCR     B
        JNZ     QM10            ;UNTIL 4 TIMES
        POP     H               ;RESTORE ALL REGISTERS
        POP     D
        POP     B
        RET
;
;       TEST 4 BYTES POINTED TO BY HL FOR
;       M(HL) = 0?
;
QTEST:  PUSH    H               ;SAVE HL
        MOV     A,M             ;GET FIRST BYTE
        INX     H
        ORA     M               ;COMBINE WITH 2ND BYTE
        INX     H
        ORA     M               ;COMBINE WITH 3RD BYTE
        INX     H
        ORA     M               ;COMBINE WITH 4TH BYTE
        POP     H               ;RESTORE HL
        RET
;
;       NEGATE THE QUAD PRECISION NUMBER POINTED TO BY HL
;       M(HL) = - M(HL)
;
QNEG:   PUSH    B               ;SAVE BC
        INX     H               ;MOVE HL TO LSB
        INX     H
        INX     H
        MVI     B,4
        ORA     A               ;CLEAR CARRY
QN10:   MVI     A,0             ;CLEAR A WITHOUT AFFECTING CARRY
        SBB     M
        MOV     M,A
        DCX     H
        DCR     B
        JNZ     QN10
        INX     H               ;RESTORE HL
        POP     B               ;RESTORE BC
        RET
;
;       LOGICAL SHIFT LEFT 4 BYTES POINTED TO BY HL
;       M(HL) = LSL(M(HL))
;
QLSL:   PUSH    B               ;SAVE BC
        INX     H               ;MOVE POINTER TO LSB
        INX     H
        INX     H
        MVI     B,4
        ORA     A               ;CLEAR CARRY
QLSL10: MOV     A,M
        RAL
        MOV     M,A
        DCX     H
        DCR     B
        JNZ     QLSL10
        INX     H               ;RESTORE HL
        POP     B               ;RESTORE BC
        RET
;
;       LOGICAL RIGHT SHIFT OF 4 BYTES POINTED TO BY HL
;       M(HL) = LSR(M(HL))
;
QLSR:   PUSH    B               ;SAVE BC
        PUSH    H               ;SAVE HL
        MVI     B,4
        ORA     A               ;CLEAR CARRY
QLSR10: MOV     A,M
        RAR
        MOV     M,A
        INX     H
        DCR     B
        JNZ     QLSR10
        POP     H               ;RESTORE HL
        POP     B               ;RESTORE BC
        RET
;
;       CLEAR 4 BYTES POINTED TO BY HL
;       M(HL) = 0
;
QCLR:   PUSH    H
        XRA     A
        MOV     M,A
        INX     H
        MOV     M,A
        INX     H
        MOV     M,A
        INX     H
        MOV     M,A
        POP     H
        RET
;
        END

Figure 4.2. Fix to Float/Float to Fix Conversion Subroutines



The following is a step-by-step description of the algorithm used 
in the conversion example: 

a. Copy the fixed point number into the location of the floating 
point number. 

b. Test the floating point number to see if it is zero. 

c. Return to caller if the number is zero. 

d. The sign is defaulted to 0 (plus).

e. Default the actual exponent to 23. This is the exponent that
would be valid if no shift is required, i.e., the most significant 1
is in bit position 23. Since the Am9512 format has a bias of
127 (decimal), the bias is added to the default value to make the
default exponent 23 + 127 = 150 (decimal).

f. If bit 31 in the floating point register = 1, then the input number
is a negative number. The number in the floating point register
is negated (two's complement negation) and the sign is
set to 1.

g. If bits 24-31 of the floating point register are all zeroes, then
the input number has an exponent less than or equal to 23. The
program transfers to step j for possible left shifts. Otherwise
the program falls through to h.

h. Bits 24-31 are not all zeroes. This means the magnitude of the
fixed point number is at least 2^24. The floating point
register is right-shifted one place and the exponent is
incremented by 1.

i. Test bits 24-31 again for all zeroes. If they are not all zeroes,
repeat step h. If bits 24-31 are all zeroes, shifting is complete
and the program transfers to step l.

j. Bits 24-31 are all zero. If bit 23 = 1, no more shifting is
required and the program transfers to step l.

k. Left-shift floating point register. Decrement exponent by 1 and
repeat step j.

l. Shifting is complete. The exponent is stored into bits 23-30.
(The original bit 23, the "hidden 1", is overwritten.)

m. Store the sign into bit 31 of the floating point register.

n. Return to caller.
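The steps above can be sketched in Python, working on a 32-bit integer instead of four bytes in memory. The sketch assumes the layout described in the steps (sign in bit 31, biased exponent in bits 23-30, 23 stored mantissa bits with a hidden 1) and, like the steps, truncates on right shifts. The function name fix_to_float is illustrative and is not the FXTOFL routine itself.

```python
def fix_to_float(value):
    """Convert a 32-bit two's complement integer to a packed float word."""
    word = value & 0xFFFFFFFF          # step a: copy into the float register
    if word == 0:                      # steps b-c: zero needs no conversion
        return 0
    sign = 0                           # step d
    exp = 23 + 127                     # step e: default biased exponent
    if word & 0x80000000:              # step f: negative input
        sign = 1
        word = (-word) & 0xFFFFFFFF
    while word >> 24:                  # steps g-i: bits 24-31 not all zero
        word >>= 1
        exp += 1
    while not word & 0x800000:         # steps j-k: bring a 1 into bit 23
        word <<= 1
        exp -= 1
    # Steps l-m: the exponent overwrites the hidden 1, then the sign is set.
    return (sign << 31) | (exp << 23) | (word & 0x7FFFFF)

one = fix_to_float(1)
minus_five = fix_to_float(-5)
big = fix_to_float(1 << 24)
```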
4.3 FLOATING POINT TO BINARY FIXED POINT

Figure 4.3 shows the flowchart of a floating point to fixed point
conversion. An Am9080A assembly language subroutine that
implements the flowchart is shown in Figure 4.2. The
following is a step-by-step description of the algorithm:

a. Copy the floating point number into the fixed point register.

b. If the floating point number is zero, return to caller.

c. Unpack the floating point number from the fixed point register
by removing the exponent and sign. The exponent (in the
unbiased form) and the sign are stored in CPU registers. The
"hidden 1" is restored in the fixed point register.

d. If exponent is less than 0, zero fixed point register and exit.

e. If exponent is larger than 31, set overflow flag and exit.

f. Subtract 23 from exponent to derive the shift count.

g. If the adjusted exponent is greater than zero, the original
exponent is greater than 23 and the program transfers to step j to
left-shift the fixed point register, or else it falls through to step h.

h. If the exponent = 0, shift is complete and the program
transfers to step l.

i. Right-shift the fixed point number one position and increment
the exponent by 1. Repeat step h.

j. Left-shift the fixed point number by one position and
decrement the exponent by 1.

k. If the exponent is not zero, repeat step j; or else, the
program falls through to step l.

l. Test the original sign of the floating point number. If sign is
positive skip step m.

m. If the sign is negative, negate the number in the fixed point
register (two's complement).

n. Return to caller.
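The reverse conversion can be sketched in Python in the same way. The sketch assumes the same 32-bit layout (sign in bit 31, exponent biased by 127 in bits 23-30, hidden 1) and returns a pair (result, overflow). It treats an unbiased exponent of 31 or more as overflow, because 2^31 does not fit in a signed 32-bit result; that threshold is a choice made for this sketch. The function name float_to_fix is illustrative.

```python
def float_to_fix(word):
    """Convert a packed float word to a 32-bit two's complement integer.

    Returns (fixed, overflow); fixed is None when overflow is True.
    """
    if word == 0:                               # step b
        return 0, False
    sign = word >> 31                           # step c: unpack
    exp = ((word >> 23) & 0xFF) - 127
    fixed = (word & 0x7FFFFF) | 0x800000        # restore the hidden 1
    if exp < 0:                                 # step d: magnitude below 1
        return 0, False
    if exp >= 31:                               # step e: too large
        return None, True
    count = exp - 23                            # step f: shift count
    if count > 0:
        fixed <<= count                         # steps j-k
    else:
        fixed >>= -count                        # steps h-i (truncates)
    if sign:                                    # steps l-m
        fixed = (-fixed) & 0xFFFFFFFF
    return fixed, False

one = float_to_fix(0x3F800000)          # 1.0
minus_five = float_to_fix(0xC0A00000)   # -5.0
half = float_to_fix(0x3F000000)         # 0.5 truncates to 0
too_big = float_to_fix(0x4F000000)      # 2^31 does not fit
```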



[Flowchart: START; FIX = FLOAT; unpack sign and exponent;
EXP = EXP - 23; left shift FIX with EXP = EXP - 1, or right shift
FIX with EXP = EXP + 1, until EXP = 0; negate FIX if sign is
negative; RETURN]

Figure 4.3. Float to Fix Conversion Flowchart
4.4 DECIMAL TO BINARY FLOATING POINT CONVERSION

When a programmer works with binary floating point numbers, it 
is often necessary to convert decimal numbers into binary floating
point notation to enter the desired numbers into the machine. 
Figure 4.4 shows the flowchart of such a conversion program and 
Figure 4.5 shows a BASIC program that does the conversion. 

The program uses an array A of 32 elements. Each element of the 
array corresponds to one bit of the floating point number: A(31) is 
the sign bit, A(30) to A(23) represent the exponent and A(22) to 
A(0) represent the mantissa. Other variables used are as follows: 

D - The decimal number entered from console 
E - The exponent of the binary floating point number 
H - An index to the hexadecimal string with range 0-15 
H$ - An ASCII string of all hexadecimal characters used for 
hexadecimal output 



I - An integer used for loop index 

J - A number used for comparison when unpacking the 

exponent and the mantissa 
M - The mantissa of the binary floating point number 

The following equations convert a floating point number from one
base to another:

Let E2 = Exponent of new number
    M2 = Mantissa of new number
    B2 = Base of new number
    N1 = Original number

Given N1 and B2, the equations used to solve E2 and M2 are:

E2 = INT (LOG (N1)/LOG (B2))
M2 = N1/(B2 ** E2)
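The two equations translate directly into Python. This is a minimal sketch for positive N1; it takes INT to mean rounding toward minus infinity (as in BASIC), and the function name convert_base is chosen here for illustration. Because LOG is itself computed in floating point, values of N1 that are exact powers of B2 may land one step off and should be checked separately.

```python
import math

def convert_base(n1, b2):
    """Return (E2, M2) such that n1 = M2 x b2 ** E2, for n1 > 0."""
    e2 = math.floor(math.log(n1) / math.log(b2))   # E2 = INT(LOG(N1)/LOG(B2))
    m2 = n1 / (b2 ** e2)                           # M2 = N1/(B2 ** E2)
    return e2, m2

ten = convert_base(10.0, 2)        # 10 = 1.25 x 2^3
tenth = convert_base(0.1, 2)       # 0.1 = 1.6 x 2^-4
planck = convert_base(6.626196e-27, 16)
```

With these equations the mantissa satisfies 1 <= M2 < B2; the BASIC program of Figure 4.5 adds 1 to the exponent so that its mantissa is a fraction below 1.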





[Flowchart: START; zero array A(0) - A(31); input decimal number D;
if D = 0, output "00000000"; if D < 0, set sign A(31) = 1 and
negate D; get unbiased exponent E = INT (LOG D/LOG 2); get
mantissa M = D/2^E; get biased exponent E = E + 127; convert
exponent to binary in A(30) - A(23); convert mantissa to binary in
A(22) - A(0); output A(31) - A(0) in hexadecimal]

Figure 4.4. Decimal to Binary Floating Point Conversion Flowchart
10 REM
20 REM
30 DIM A(32)
40 H$ = "0123456789ABCDEF"
50 PRINT "INPUT DECIMAL NO. ";
60 INPUT D
70 REM CLEAR BINARY ARRAY
80 FOR I = 0 TO 31
90 A(I) = 0
100 NEXT I
110 IF D = 0 THEN 450
120 IF D < 0 THEN A(0) = 1
130 D = ABS(D)
140 REM FIND THE EXPONENT
150 E = INT(LOG(D)/LOG(2)) + 1
160 M = D/2^E
170 REM FORM BINARY ARRAY FOR EXPONENT
180 IF E < 0 THEN 250
190 J = 128
200 FOR I = 1 TO 7
210 J = J/2
220 IF E >= J THEN A(I) = 1 : E = E - J
230 NEXT I
240 GOTO 320
250 REM E IS NEGATIVE
260 A(1) = 1
270 J = -64
280 FOR I = 2 TO 7
290 J = J/2
300 IF E >= J THEN A(I) = 1 ELSE E = E - J
310 NEXT I
320 REM FORM BINARY ARRAY FOR MANTISSA
330 J = 1
340 FOR I = 8 TO 31
350 J = J/2
360 IF M >= J THEN A(I) = 1 : M = M - J
370 NEXT I
380 REM FORM HEXADECIMAL NUMBER AND OUTPUT IT
390 FOR I = 0 TO 31 STEP 4
400 H = 8*A(I) + 4*A(I+1) + 2*A(I+2) + A(I+3)
410 PRINT MID$(H$,H+1,1);
420 NEXT I
430 PRINT
440 GOTO 50
450 PRINT "00000000"
460 GOTO 50

a) Decimal String to Am9511A Floating Point Format

Figure 4.5. Decimal to Binary Floating Point Conversion Programs
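For readers who want to cross-check listing (a), here is a minimal Python sketch of the same packing, assuming the layout the listing implies: a sign bit, a 7-bit two's complement exponent, and a 24-bit mantissa M with 0.5 <= M < 1. The function name `encode_am9511` is hypothetical, and `math.frexp` is swapped in for the LOG-based exponent search used in the BASIC program.

```python
import math

def encode_am9511(d):
    """Pack d into 8 hex digits: sign, 7-bit two's complement exponent, 24-bit mantissa."""
    if d == 0:
        return "00000000"
    sign = 1 if d < 0 else 0
    m, e = math.frexp(abs(d))          # abs(d) == m * 2**e with 0.5 <= m < 1
    if not -64 <= e <= 63:
        raise ValueError("exponent does not fit in 7 bits")
    mantissa = int(m * (1 << 24))      # truncate to 24 bits, as the listing does
    word = (sign << 31) | ((e & 0x7F) << 24) | mantissa
    return f"{word:08X}"

print(encode_am9511(1.0))    # mantissa 0.5, exponent 1
print(encode_am9511(-0.25))  # mantissa 0.5, exponent -1, sign set
```

Like the BASIC program, this sketch truncates rather than rounds any mantissa bits beyond the 24th.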
10 REM
20 REM
30 REM
40 REM
50 DEFINT A,I,H
60 DIM A(32)
70 H$ = "0123456789ABCDEF"
80 REM
90 REM CLEAR BINARY ARRAY A(0) TO A(31)
100 REM
110 FOR I = 0 TO 31
120 A(I) = 0
130 NEXT I
140 REM
150 REM INPUT A DECIMAL NUMBER FROM CONSOLE
160 REM
170 PRINT
180 INPUT "ENTER DECIMAL NUMBER";D
190 REM
200 REM CHECK IF INPUT NUMBER IS ZERO
210 REM
220 IF D <> 0 THEN 280
230 PRINT "00000000"
240 GOTO 180
250 REM
260 REM INPUT IS NOT ZERO, CHECK IF IT IS NEGATIVE
270 REM
280 IF D < 0 THEN A(31) = 1 : D = -D
290 REM
300 REM FIND THE UNBIASED EXPONENT
310 REM
320 E = INT(LOG(D)/LOG(2))
330 REM
340 REM FIND THE MANTISSA
350 REM
360 M = D/2^E
370 REM
380 REM FIND THE BIASED EXPONENT
390 REM
400 E = E + 127
410 REM
420 REM FORM BINARY ARRAY FOR EXPONENT
430 REM
440 J = 256
450 FOR I = 30 TO 23 STEP -1
460 J = J/2
470 IF E >= J THEN A(I)=1 : E=E-J
480 NEXT I
490 REM
500 REM FORM BINARY ARRAY FOR MANTISSA
510 REM
520 M = M - 1 : REM STRIP OFF "HIDDEN 1"
530 J = 1
540 FOR I = 22 TO 0 STEP -1
550 J = J/2
560 IF M >= J THEN A(I)=1:M=M-J
570 NEXT I
580 REM
590 REM FORM HEXADECIMAL NUMBER AND OUTPUT TO CONSOLE
600 REM
610 FOR I = 31 TO 0 STEP -4
620 H = 8*A(I) + 4*A(I-1) + 2*A(I-2) + A(I-3)
630 PRINT MID$(H$,H+1,1);
640 NEXT I
650 GOTO 110

b) Decimal String to Am9512 Floating Point Format

Figure 4.5. Decimal to Binary Floating Point Conversion Programs (Cont.)
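The packing performed by listing (b) can be sketched in Python as follows. This is an illustration only, assuming the layout the listing uses: sign in bit 31, an 8-bit exponent biased by 127 in bits 30 to 23, and a 23-bit fraction with a hidden leading 1. The name `encode_am9512_single` is hypothetical, and `math.frexp` replaces the LOG-based exponent search.

```python
import math

def encode_am9512_single(d):
    """Pack d into 8 hex digits: sign, exponent biased by 127, 23-bit fraction."""
    if d == 0:
        return "00000000"
    sign = 1 if d < 0 else 0
    m, e = math.frexp(abs(d))            # 0.5 <= m < 1
    m, e = m * 2, e - 1                  # renormalize so that 1 <= m < 2
    biased = e + 127
    if not 1 <= biased <= 254:
        raise ValueError("exponent out of range for this sketch")
    fraction = int((m - 1) * (1 << 23))  # strip the hidden 1, truncate to 23 bits
    word = (sign << 31) | (biased << 23) | fraction
    return f"{word:08X}"

print(encode_am9512_single(1.0))
print(encode_am9512_single(-2.5))
```

Because this layout matches IEEE single precision, values that need no rounding should agree with `struct.pack('>f', d).hex()` in Python; values that do need rounding may differ in the last bit, since the sketch (like the BASIC program) truncates.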
10 REM
20 REM
30 REM
40 REM
50 DEFINT H,I,S : DIM H(8)
60 REM
70 REM INPUT BINARY FLOATING POINT IN HEXADECIMAL
80 REM
90 INPUT "ENTER AN 8 DIGIT HEXADECIMAL NUMBER";H$
100 REM
110 REM UNPACK HEXADECIMAL NUMBER INTO A BINARY ARRAY
120 REM
130 FOR I = 0 TO 7
140 C$ = MID$(H$,I+1,1)
150 H(I) = ASC(C$)
160 IF (H(I) < 48 OR H(I) > 70) THEN 530
170 IF (H(I) > 57 AND H(I) < 65) THEN 530
180 H(I) = H(I) - 48
190 IF H(I) > 9 THEN H(I) = H(I) - 7
200 NEXT I
210 REM
220 REM FIND THE SIGN OF THE NUMBER
230 REM
240 S = 0
250 IF H(0) > 7 THEN S = 1
260 REM
270 REM FIND THE EXPONENT OF THE NUMBER
280 REM
290 E = 32*(H(0) AND 7) + 2*H(1) + (H(2) AND 8)/8 - 127
300 REM
310 REM FIND THE MANTISSA OF THE NUMBER
320 REM
330 H(2) = H(2) AND 7
340 M = 1
350 FOR I = 2 TO 7
360 M = M + H(I)/2^(3+4*(I-2))
370 NEXT I
380 REM
390 REM FIND THE NUMBER BY COMBINING EXPONENT & MANTISSA
400 REM
410 N = (2^E) * M
420 REM
430 REM CHECK SIGN TO SEE IF NEGATION REQUIRED
440 REM
450 IF S = 1 THEN N = -N
460 REM
470 REM OUTPUT DECIMAL NUMBER
480 REM
490 PRINT N : GOTO 90
500 REM
510 REM ILLEGAL INPUT DETECTED, ABORT
520 REM
530 PRINT "INPUT ERROR, UNKNOWN CHARACTER '";C$;"'" : GOTO 90

b) Hexadecimal Floating Point



Figure 4.7. Binary to Decimal Floating Point Conversion Program
10 REM
20 REM
30 REM
40 DEFINT A,I
50 DEFDBL D-H,J-Z
60 DIM A(64)
70 H$ = "0123456789ABCDEF"
80 INPUT "ENTER DECIMAL NUMBER"; D
90 REM CLEAR BINARY ARRAY
100 FOR I = 0 TO 63
110 A(I) = 0
120 NEXT I
130 IF D = 0 THEN 540
140 IF D < 0 THEN A(0) = 1
150 D = ABS(D)
160 REM FIND THE UNBIASED EXPONENT
170 E = INT(LOG(D)/LOG(2))
180 REM USE ITERATIVE LOOP TO FIND 2^E BECAUSE
190 REM EXPONENTIATION IS NOT EXACT T = 2^E
200 T = 1
210 IF E = 0 THEN 320
220 IF E > 0 THEN 280
230 REM THE EXPONENT IS NEGATIVE
240 FOR I = -1 TO E STEP -1
250 T = T/2
260 NEXT I
270 GOTO 320
280 FOR I = 1 TO E
290 T = 2*T
300 NEXT I
310 REM FIND THE MANTISSA AND BIASED EXPONENT
320 M = D/T
330 E = E + 1023
340 REM FORM BINARY ARRAY FOR EXPONENT
350 J = 2048
360 FOR I = 1 TO 11
370 J = J/2
380 IF E >= J THEN A(I)=1:E=E-J
390 NEXT I
400 REM FORM BINARY ARRAY FOR MANTISSA
410 M = M - 1#
420 J = 1
430 FOR I = 12 TO 63
440 J = J/2
450 IF M >= J THEN A(I)=1 : M=M-J
460 NEXT I
470 REM FORM HEXADECIMAL NUMBER AND OUTPUT IT
480 FOR I = 0 TO 63 STEP 4
490 H = 8*A(I) + 4*A(I+1) + 2*A(I+2) + A(I+3)
500 PRINT MID$(H$,H+1,1);
510 NEXT I
520 PRINT
530 GOTO 80
540 PRINT "0000000000000000"
550 GOTO 80

c) Decimal String to Am9512 Floating Point - Double Precision Format

Figure 4.5. Decimal to Binary Floating Point Conversion Programs (Cont.)
10 REM
20 REM
30 DEFDBL A-G,K-Z
35 DEFINT I,J
40 DIM C(16)
50 INPUT "INPUT 16 DIGIT HEXADECIMAL NUMBER ";H$
60 REM UNPACK HEXADECIMAL NUMBER INTO A BINARY ARRAY
70 FOR I = 0 TO 15
80 C$ = MID$(H$,I+1,1)
90 C(I) = ASC(C$) - 48
100 IF C(I) < 0 THEN 290
110 IF C(I) > 10 THEN C(I) = C(I) - 7
120 IF C(I) > 15 THEN 290
130 NEXT I
140 REM FIND SIGN OF NUMBER
150 S = 0
160 IF C(0) > 7 THEN S = 1
170 REM FIND EXPONENT OF NUMBER
180 E = 256*(C(0) AND 7) + 16*C(1) + C(2) - 1023
190 REM FIND MANTISSA OF NUMBER
200 C(2) = C(2) AND 7
210 M = 1
220 FOR I = 3 TO 15
230 M = M + C(I)/2^(4*(I-2))
240 NEXT I
250 N = (2^E) * M
260 IF S = 1 THEN N = -N
270 PRINT N
280 GOTO 50
290 PRINT "INPUT ERROR"
300 GOTO 50

c) Double Precision Decimal Number



Figure 4.7. Binary to Decimal Floating Point Conversion Program (Cont.)
4.5 BINARY TO DECIMAL FLOATING POINT CONVERSION

In order to read the value of a binary floating point number stored
in a computer, it is often useful to convert it to a decimal number
so a person can visualize the number. The conversion from
binary to decimal is somewhat simpler than from decimal to
binary. The following is an algorithm to convert a binary number
into a decimal number:

a. Unpack the binary floating point number into sign (S), unbiased
exponent (E) and mantissa (M).

b. Obtain the decimal value of the exponent using an integer
binary to decimal conversion routine.

c. Obtain the decimal value of the mantissa using a fractional
binary to decimal conversion routine.

d. Obtain the decimal value using

(-1)^S x 2^E x M
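Steps a through d can be sketched in Python for the 32-bit Am9512 layout (sign bit, 8-bit exponent biased by 127, 23-bit fraction with a hidden leading 1). The function name `decode_am9512_single` is hypothetical and the sketch is an illustration under those layout assumptions; an all-zero word is treated as zero, and no other special cases are handled.

```python
def decode_am9512_single(hex_str):
    """Convert an 8-digit hexadecimal string to its decimal value."""
    word = int(hex_str, 16)
    if word == 0:
        return 0.0
    # Step a: unpack sign, unbiased exponent and mantissa
    s = word >> 31
    e = ((word >> 23) & 0xFF) - 127
    m = 1 + (word & 0x7FFFFF) / (1 << 23)   # restore the hidden 1
    # Steps b to d: combine as (-1)^S x 2^E x M
    return (-1) ** s * 2.0 ** e * m

print(decode_am9512_single("3F800000"))
print(decode_am9512_single("C0200000"))
```

Python integers and floats make steps b and c implicit; on a small machine each would be a separate integer or fractional conversion routine, as the algorithm describes.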



The flowchart in Fig. 4.6 and the BASIC program in Fig. 4.7 illustrate
an example of such a conversion. The following is a description
of the variables used in the BASIC program:

C$        - A single ASCII character used during unpacking
            of the input string.
E         - The exponent of the binary floating point number.
H(0)-H(7) - Each element of the array represents the value of
            each hexadecimal ASCII character entered. That
            is, each element has the value 0 to 15.
H$        - The input string, which should be an 8-digit
            hexadecimal number. Characters entered after
            the eighth character are ignored.
I         - An integer used for loop index.
M         - The mantissa of the binary floating point number.
N         - The decimal floating point number.
Figure 4.6. Binary to Decimal Floating Point Conversion Flowchart
10 REM
20 REM
30 REM
40 DIM C(8)
50 PRINT "INPUT 8 DIGIT HEXADECIMAL NUMBER: ";
60 INPUT H$
70 REM UNPACK HEXADECIMAL NUMBER INTO BINARY ARRAY
80 FOR I = 0 TO 7
90 C$ = MID$(H$,I+1,1)
100 REM CHECK IF INPUT IS ZERO
110 IF H$ <> "00000000" THEN 140
120 PRINT "0"
130 GOTO 50
140 C(I) = ASC(C$) - 48
150 IF C(I) < 0 THEN 370
160 IF C(I) > 10 THEN C(I) = C(I) - 7
170 IF C(I) > 15 THEN 370
180 NEXT I
190 REM CHECK IF INPUT IS NORMALIZED
200 IF (C(2) AND 8) > 0 THEN 230
210 PRINT "INPUT NOT NORMALIZED FLOATING POINT NO."
220 GOTO 50
230 REM FIND SIGN OF NUMBER
240 S = 0
250 IF C(0) > 7 THEN S = 1
260 REM FIND EXPONENT OF NUMBER
270 E = 16*(C(0) AND 7) + C(1)
275 IF E > 63 THEN E = E - 128 : REM EXPONENT IS 7-BIT TWO'S COMPLEMENT
280 REM FIND MANTISSA OF NUMBER
290 M = 0
300 FOR I = 2 TO 7
310 M = M + C(I)/2^(4*(I-1))
320 NEXT I
330 N = (2^E) * M
340 IF S = 1 THEN N = -N
350 PRINT N
360 GOTO 50
370 PRINT "INPUT ERROR"
380 GOTO 50

Figure 4.7. Binary to Decimal Floating Point Conversion Programs
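The same unpacking can be sketched in Python for the Am9511A layout, assuming (as the decimal to binary listing for the Am9511A implies) a sign bit, a 7-bit two's complement exponent and a 24-bit mantissa interpreted as a fraction below 1. The name `decode_am9511` is hypothetical, and the normalization check of the BASIC program is omitted for brevity.

```python
def decode_am9511(hex_str):
    """Convert an 8-digit Am9511A-style hexadecimal string to its decimal value."""
    word = int(hex_str, 16)
    if word == 0:
        return 0.0
    s = word >> 31
    e = (word >> 24) & 0x7F
    if e > 63:                 # exponent is stored in 7-bit two's complement
        e -= 128
    m = (word & 0xFFFFFF) / (1 << 24)   # mantissa as a fraction, 0.5 <= m < 1 if normalized
    return (-1) ** s * 2.0 ** e * m

print(decode_am9511("01800000"))
print(decode_am9511("FF800000"))
```

The sign adjustment on the exponent is what lets values smaller than 0.5 in magnitude decode correctly.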
CHAPTER 5
SINGLE-CHIP FLOATING POINT PROCESSORS 



5.1 INTRODUCTION 

Until recently, floating point computation has been implemented
either in software or in hardware with MSI/SSI (medium-scale
integration/small-scale integration) devices. The former method
involves considerable programming effort, and the resulting product
is usually very slow. It also consumes valuable main memory
space for the floating point routines. The latter method involves
using hundreds of ICs, which requires considerable development
effort, and the resulting product is expensive to manufacture and
requires considerable power and space. With the advent of LSI
(large-scale integration) technology in recent years, it has become
possible to put a complete hardware floating point processor into
a single IC.

The advantages of the single-chip LSI floating point processor 
compared to previous hardware implementation are as follows: 

Low development cost - 

The cost of developing an interface to a single-chip floating 
point processor should be less than 10 percent of the cost of 
developing a complete hardware floating point processor. 

Low production cost - 

The cost of producing and testing of hardware floating point 
boards is at least several hundred dollars whereas the cost of a 
single-chip processor is only a small fraction of that cost. 

Improved reliability - 

Most electronic failures occur at the interface level. By com- 
bining all the logic inside a single device, the number of con- 
nections in the system is drastically reduced. Hence reliability 
is increased. 

Less power consumption -

The single-chip processor typically draws less than 5 percent of
the power of an MSI/SSI implementation.

Less space -

The single-chip processor usually fits on the same board as the
CPU, thus requiring one or two fewer boards than the MSI/SSI
solution.

Get product to market sooner - 

Due to less effort required both for development and produc- 
tion, using single-chip processors will shorten the design cycle 
of a new product. 

The advantages of the single-chip LSI floating point processor 
over software floating point computation methods are: 

Enhanced execution speed - 

Hardware floating point processors typically execute floating
point arithmetic five to 50 times faster than software. If the
floating point processor allows concurrent CPU execution, the
overall throughput is enhanced even further for applications
where the CPU can do other meaningful tasks during a floating
point computation.

Low development cost - 

Developing a comprehensive software floating point package
often involves many man-months of programming effort. With a
hardware processor, programming is drastically reduced because
the floating point computation algorithm is precoded inside the
hardware processor.

Less main memory required - 

Since the floating point processors contain the computation
algorithm on chip (often in microcode), they can save a few
thousand bytes of main memory. This is important in
applications where the CPU has limited addressing space.

Improved portability - 

With new microprocessors appearing in rapid succession,
software often must be rewritten when upgrading from one
CPU to another. When hardware processors are used, there is
no need to rewrite the floating point routines.

The first LSI single-chip floating point processors available
commercially were introduced by Advanced Micro Devices. AMD
introduced the Am9511 Arithmetic Processor unit in 1977 and the
Am9512 Floating Point Processor unit in 1979.

5.2 Am9511A ARITHMETIC PROCESSOR 

This pioneering single-chip arithmetic processor interfaces with
most popular 8-bit microprocessors such as the Am9080A, Am8085,
MC6800 by Motorola and Z80 by Zilog. It can also be used with
16-bit microprocessors such as the AmZ8000,* but its performance
with such 16-bit microprocessors is somewhat hindered by its
8-bit external data bus.

Although the external interface is only 8 bits wide, the Am9511A 
internally is a 16-bit microprogrammed, stack-oriented floating 
point machine. It includes not only floating point operations but 
fixed point as well. In addition to the basic add, subtract, multiply 
and divide operations, transcendental derived functions are also 
included. A data sheet of Am9511A is included in Appendix A. 

5.3 Am9512 FLOATING POINT PROCESSOR 

The Am9512 is a follow-up to the Am9511A. Although the 
hardware interface between the two chips is similar, the data 
formats are different. 

The Am9512 supports two data types: 32-bit binary floating point 
and 64-bit binary floating point. The formats adopted are com- 
patible with one of the proposed IEEE formats. Unlike the 
Am9511A, the Am9512 does not have any of the derived trans- 
cendental functions. A description of the Am9512 is included in 
Appendix B. 



*Z8000 is a trademark of Zilog, Inc.
CHAPTER 6
SOME INTERFACE EXAMPLES 



6.1 INTRODUCTION 

This chapter describes examples of interfacing some of the
popular microprocessors to the Am9511A and Am9512 single-chip
floating point processors. The examples given are for conceptual
illustration only; minor timing details may need to be
modified for systems running at nonstandard clock rates.

6.2 Am9080A TO Am9511A INTERFACE 

Figure 6.1 illustrates a sample interface for an Am9080A 8-bit
microprocessor to an Am9511A. The system controller that interfaces
to the Am9511A is an Am8238 and not an Am8228 because
the IOW (or MEMW) from the Am8228 will appear too late to put
the Am9080A into the WAIT state. This could cause
overwriting of Am9511A internal registers.

In the example illustrated, the CS input comes from an address
comparator Am25LS2521 (8-bit comparator). Note that the chip
select decoder must not be strobed with IOR or IOW, because
doing so will cause CS to go LOW after IOR or IOW has gone LOW.
The Am9511A chip select to read or write time has a minimum
setup time of 0. Strobing the chip select decoder will cause the
setup time to be negative and cause the Am9511A to malfunction.



Note that the Am9511 CS (but not the Am9511A) requires a
high-to-low transition for every read or write cycle. This means
that the address decode should be as explicit as possible to
guarantee a low-to-high transition on the address decode. In Fig.
6.1, only low-order address locations are used and an Am9080A
program cannot form a read/write loop in 2 bytes; a transition on
the address comparator is guaranteed. If a 4-bit comparator is used
instead of a 7-bit comparator, the program could form a read/write
loop in 16 bytes. If the loop memory address always coincides
with the Am9511 I/O address, there will not be a transition on the
comparator output and the Am9511 will not function properly.
Although the Am9080A duplicates the I/O address on A8-A15,
these address lines should not be used for Am9511 address
decode because if the program is executing in a region where the
upper 7 bits of address match the Am9511 I/O port number, no
chip-select transition may occur.

The example shows an interrupt driven interface. At the end of
every Am9511A operation, the END signal goes LOW. This
causes the Am9080A to go into an interrupt-acknowledge sequence.
Since the INTA on the Am8238 is pulled to +12V through
a 1K resistor, the data bus is pulled to all 1's during the
interrupt-acknowledge cycle. This generates an RST 7 instruction to the
Am9080A. The Am9080A stores the current program counter on
the stack and jumps to location 38H to execute the interrupt
handling routine. By pulling the EACK HIGH, the END output will
stay LOW until the first read/write operation is performed on the
Am9511A, thus clearing the interrupt request.

Figure 6.1. Am9080A to Am9511A Interface



6.3 Am9080A TO Am9512 INTERFACE 

Figure 6.2 illustrates an example of interfacing the Am9512 to the
Am9080A. The principal timing difference between the Am9511A
and the Am9512 is that the PAUSE follows RD or WR in the
Am9511A whereas the PAUSE follows CS in the Am9512.

Two additional gates (74LS08 and 74LS32) are inserted in the
PAUSE to RDYIN line. Otherwise, during a memory cycle in
which the memory address bits 1 to 7 match the I/O address of the
Am9512, the PAUSE will go LOW. Since there will be no IOR or
IOW in that cycle to reset the PAUSE, the system will be deadlocked.
The additional gates allow the PAUSE to pass through
only if the current cycle is an I/O cycle. Strobing the chip select
decoder with IOR or IOW will not work because that will create a
negative chip select to RD or WR setup time, which is not permitted
with the Am9512. Other considerations about the chip-select
decoding are the same as discussed in Section 6.2.

The 74LS32 gate shown at the top of Figure 6.2 allows either END
or ERR to interrupt the CPU. The CPU can read the status register
of the Am9512 to determine the source of the interrupt.



6.4 Am8085A to Am9511-1 INTERFACE 

In a typical Am8085A system, the system clock rate is 3 MHz. The
Am9511-1 is selected because it has a maximum
clock rate of 3 MHz. The Am8085A has an earlier ready setup
window compared with the Am9080A. If the PAUSE signal is
connected directly to the READY input to the Am8085A, the ready
line will be pulled down too late for the Am8085A to go into the
WAIT state. The 74LS74 is used for forcing one WAIT state when
the Am9511-1 is accessed. After the first WAIT state, the 74LS74
Q output is reset to HIGH and the PAUSE of the Am9511-1
controls any additional wait states if necessary. The chip-select
decoder is strobed with the IO/M signal to prevent the Am9511-1 from
responding to memory accesses when bits 9 to 15 of the memory
address coincide with the Am9511-1 I/O address.

6.5 Am8085A TO Am9512-1 INTERFACE 

The Am9512 is designed specifically to interface to the
Am8085A. The interface is straightforward and no additional logic
is required. The Am9512-1 is used instead of the Am9512 because
the typical Am8085A system runs at 3 MHz.

The ERR output and END output are connected to separate
interrupt inputs so that the CPU can identify the source of the interrupt
without reading the status register of the Am9512-1.

Since the chip-select decoder is strobed with the IO/M signal, a
transition is guaranteed with each I/O operation without the concern
of insufficient address decode as in the Am9080A to
Am9511A or Am9512 interfaces.
6.6 Z80 TO Am9511A INTERFACE

Figure 6.5 illustrates a programmed I/O interface technique for
the Am9511A with a Z80 CPU.

The Chip Select (CS) signal is a decode of Z80 address lines
A1-A7. This assigns the Am9511A to two consecutive addresses,
an even (Data) address, and the next higher odd (Command)
address. Selection between the Data (even) and the Command/
Status (odd) ports is by the least significant address bit A0.

The IORQ (Input/Output Request) from the Z80 is an enable input
to the Am25LS139 decoder. The WR and RD from the Z80 are the
two inputs to the decoder. The outputs Y1 and Y2 are tied to WR
and RD of the Am9511A. The PAUSE output of the Am9511A is
connected to the WAIT line of the Z80. The Am9511A outputs a LOW on
PAUSE 150ns (max) after RD or WR has become active. The
PAUSE remains LOW for 3.5 TCY + 50ns (min) for a data read and
for 1.5 TCY + 50ns (min) for a status read from the Am9511A,
where TCY is the clock period at which the Am9511A is running.
Therefore, the Z80 will insert one to two extra WAIT states. The
Am9511A PAUSE output responds to a data read, data write, or
command write request received while the Am9511A is still occupied
(executing a previous command) by pulling the PAUSE
output LOW. Since PAUSE and WAIT are tied together, as soon
as the Z80 tries to interfere with APU execution, the Z80 enters the
WAIT state.

6.7 Z80 TO Am9512 INTERFACE 

The Am9512 interface to the Z80 (Fig. 6.6) requires two more gates
than the Am9511A interface to the Z80. An inverter is added to the
interrupt request line because the sense of the END/ERR signals
is different. The 74LS32 is added in the wait line because the
Am9512 PAUSE will go LOW whenever chip select on the
Am9512 goes LOW. In Fig. 6.6 the chip-select input can go LOW
during the second or third cycle of an instruction when the memory
address matches the Am9512 I/O address. If the 74LS32 OR-gate
is omitted, the WAIT input on the Z80 will go LOW and the
system will be deadlocked. Strobing the chip-select decoder will
not work because this would cause a negative chip select to RD or
WR time on the Am9512.

The chip select decoder in this example is strobed with M1. This
accomplishes a dual purpose. It not only guarantees a chip select
transition on every I/O cycle, it also prevents the chip select from
going LOW during an interrupt acknowledge cycle. This is vital because
IORQ is also LOW during that cycle. Without the M1 strobe, CS
might go LOW and cause PAUSE to go LOW, which would again
cause the system to deadlock.

6.8 MC6800 TO Am9511A INTERFACE 

Figure 6.7 shows the interface of a Motorola MC6800 microprocessor
to an Am9511A. The MC6800 has no explicit I/O instructions.
All I/O devices are treated as memory locations. Therefore the
chip-select input of the Am9511A is derived from a decode of
address lines A1 to A15. The decoder is strobed by VMA (Valid
Memory Address) to produce a glitch-free output. The C/D input
of the Am9511A is connected directly to the A0 of the MC6800 so
that the even address selects the data port and the odd address
selects the status or command port. The RD and WR inputs to the
Am9511A are derived by demultiplexing the Φ2, VMA and
R/W signals.
The Am9511A has a relatively long read access time. To read the
Am9511A data or status registers, the RD pulse to the Am9511A
must be stretched and the clock to the Am9511A must keep
running because the read access time is a function of the propagation
delay and the number of clock cycles. The MC6871A clock
driver chip provides a perfect solution to the problem. It has a
memory ready input to stretch the Φ2 HIGH time and a 2XFC
free-running clock output that is not affected by the memory ready
input. The standard MC6800 uses a 1 MHz clock so that 2XFC is
at 2 MHz, which is the ideal frequency for an Am9511A. When a
CS to the Am9511A is decoded, the Am26S02 one-shot is
triggered to pull the memory ready line LOW for approximately
500ns. This allows the PAUSE output to take control of the
memory ready. The one-shot is necessary because PAUSE will
not go LOW soon enough to stretch out Φ2 in the current cycle.

Since the MC6800 is a dynamic device and the clock input must
not be stopped for more than 5 microseconds, the programmer
must not perform operations other than a status read while a
current command is still in progress. This avoids producing a
PAUSE output longer than 5 microseconds. The programmer
should check the status register to verify that the Am9511A is not
busy before performing any operation other than a status read.

6.9 MC6800 TO Am9512 INTERFACE 

The MC6800 interface to the Am9512 (Fig. 6.8) is somewhat simpler
than the MC6800 to Am9511A interface. All the discussions in
Section 6.8 also apply to this section except for the one-shot.
Since the PAUSE output from the Am9512 follows the CS instead
of RD or WR, the memory ready signal can be directly driven by
the PAUSE output. The only other addition is the inverter between
the END output of the Am9512 and the IRQ input.

The software considerations concerning the possibility of exces- 
sive PAUSE time discussed in the previous section also apply to 
the Am9512 interface. 

6.10 AmZ8002 TO Am9511A INTERFACE 

The Am9511A can also be interfaced to a 16-bit microprocessor
such as the AmZ8002. Since the data bus of the Am9511A is only
8 bits wide, the operations performed must be byte-oriented.

The RD and WR inputs to the Am9511A can be obtained by
demultiplexing the data strobe (DS) output of the AmZ8002. The
data bus of the Am9511A can be connected to either the upper 8
bits or the lower 8 bits of the AmZ8002 data bus. If the Am9511A
data bus is connected to the upper 8 bits (Fig. 6.9), the I/O
address of the Am9511A is always even. If the Am9511A data bus
is connected to the lower 8 bits, the I/O address is always odd.
The chip select is derived from a decode of A2 to A15. A1 is
used to select between data/status during READ and data/
command during WRITE.

Due to the long READ access time of the Am9511A, the AmZ8002
must be put in a WAIT state for each READ access to the
Am9511A. If the PAUSE output of the Am9511A is connected
directly to the WAIT input of the AmZ8002, the PAUSE output will
arrive too late to put the AmZ8002 into the WAIT state. The
Am25LS195A 4-bit shift register is used to solve this problem.
During each address strobe, the Q0 output will be forced LOW if
chip select to the Am9511A is present. The Q0 output will remain LOW
for two clock periods. If PAUSE is LOW during this period, the
WAIT line will remain LOW because the Am25LS195A is held in
the reset state. After the PAUSE returns to HIGH, the Q0 output will
go HIGH after two clocks and the AmZ8002 can proceed with the
current operation. An alternative method of handling the PAUSE
line is to use a one-shot as in Fig. 6.7.



6.11 AmZ8002 TO Am9512 INTERFACE 

The AmZ8002 to Am9512 interface is similar to the AmZ8002 to
Am9511A interface, except that the PAUSE output of the Am9512 can
be connected directly to the WAIT input of the AmZ8002. This is
because the PAUSE output of the Am9512 follows the chip select
instead of RD or WR and the AmZ8002 has sufficient time to go
into the WAIT state. Figure 6.10 illustrates interfacing the Am9512
with the AmZ8002.

Figure 6.10. AmZ8002 to Am9512 Interface
CHAPTER 7
Am9511A INTERFACE METHODS 



7.1 INTRODUCTION 

Interfacing the Am9080A to the Am9511A can be accomplished in
one of the following ways:

1. Demand/wait

2. Poll status

3. Interrupt driven

4. DMA transfer

The various tradeoffs of these methods are discussed below.
Although only the Am9080A and Am9511A are used as an example,
the principle applies to any of the processors discussed in
Chapter 6.

7.2 DEMAND/WAIT 

This interface is the simplest both in terms of hardware and
software. The connection is shown in Fig. 6.1, except that the
interrupt input to the Am9080A need not be connected to the END
output of the Am9511A. When this interface is used, the programmer
can regard the Am9511A as always ready for READ and
WRITE operations. If the Am9511A is not ready, the PAUSE will
go LOW to put the Am9080A in the WAIT state. When the Am9511A
has completed the current operation, the PAUSE will go HIGH
and the suspended READ or WRITE will proceed. Figure 7.1
shows an example of a program that loads the data into the
Am9511A, executes a command and retrieves the data from the
Am9511A.

The drawback of this method is that concurrent processing by the
CPU is not allowed, and the CPU also cannot respond to other
interrupts or DMA requests in the system while it is in the WAIT
state. In systems where the above considerations are not important,
this would be the preferred method. This interface is not applicable
to MC6800 systems because the clock of the MC6800 may
not be stretched beyond 5 microseconds.

7.3 POLL STATUS 

The hardware interface of this method is the same as for demand/
wait. The software (Fig. 7.2) is slightly more complicated. When
the CPU wants to READ or WRITE to the Am9511A, the status
register is first read. If the most significant bit is a 1, the Am9511A
is executing a command. The CPU should refrain from performing
any operation on the Am9511A except to loop back for another
status read. When the MSB of the status is a 0, the Am9511A has
finished executing the command and the program can fall through
to perform a READ or WRITE to the Am9511A.

This method does not allow the CPU to perform useful concurrent 
tasks, but it does allow the CPU to respond to interrupts and DMA 
requests when it is in the status poll loop. 
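The polling loop described above can be sketched in Python against a simulated device. `FakeApu`, `read_status` and `wait_until_idle` are hypothetical names used only for illustration; the one detail taken from the text is that the most significant status bit reads 1 while a command is executing.

```python
BUSY_MASK = 0x80  # most significant bit of the status register


class FakeApu:
    """Hypothetical stand-in that reports busy for a fixed number of status reads."""

    def __init__(self, busy_reads):
        self.busy_reads = busy_reads

    def read_status(self):
        if self.busy_reads > 0:
            self.busy_reads -= 1
            return BUSY_MASK
        return 0x00


def wait_until_idle(apu):
    """Loop on status reads only; return how many reads found the device busy."""
    polls = 0
    while apu.read_status() & BUSY_MASK:
        polls += 1
    return polls


print(wait_until_idle(FakeApu(3)))
```

Only after `wait_until_idle` returns would the program go on to read or write the data or command port, which is the ordering the text calls for.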

7.4 INTERRUPT DRIVEN 

The hardware configuration of the interrupt driven method is 
shown in Fig. 6.1. The CPU would first load the APU data stack 
and then issue a command. During the command execution, the 
CPU would be able to perform other useful tasks in the system. 
When the Am9511A has finished the command, the END output 
goes LOW to issue an interrupt request. When the interrupt 
request is acknowledged by the CPU, the CPU executes a routine 
to fetch from the Am9511A data stack and, if necessary, load up 
the data stack and issue another command. 

This method is most suitable for real-time multitasking systems 
because concurrent execution of the CPU and APU is allowed. 
Figure 7.3 shows an example interrupt handler for the Am9511A.



7.5 DMA TRANSFER 

If ultimate system performance is required, the Am9511A data 
stack can be loaded and unloaded by a DMA controller such as 
the Am9517. To achieve maximum throughput, two channels of 
the Am9517 DMA controller are used in the configuration shown. 
Channel 2 is used to load the Am9511A and channel 3 is used to
unload the Am9511A result into the main memory. For real-time
interrupt driven systems, an interrupt controller such as the 
Am9519A should also be used. Figure 7.4 shows the connection 
diagram of such a system and Fig. 7.5 shows a sample program 
to drive such a system. 

The following is the initializing sequence required only after 
power up or system reset: 

1. The command register:

Bit 0 = Don't care (applies to the memory-to-memory transfer option)

Bit 1 = Don't care (applies to the memory-to-memory transfer option)

Bit 3 = 0, Enable DMA controller

Bit 4 = 0, Normal timing

Bit 5 = 1, Extended write

Bit 6 = 0, DREQ active HIGH

Bit 7 = 0, DACK active LOW

2. The mode register of channel 2:

Read mode, auto initialize, address decrement, block mode

3. The mode register of channel 3:

Write mode, auto initialize, address increment, block mode

4. The word count register of channel 2:
Initialized to a count of 8

5. The word count register of channel 3:
Initialized to a count of 4

6. The mask register:
Channels 2 and 3 cleared

The word count registers may need to be modified later if the word 
count desired is not the default value. 
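
The register words in this initializing sequence can be assembled from their fields. The sketch below is an assumption-laden illustration: the mode-register bit positions (channel in bits 0-1, transfer type in bits 2-3, auto initialize in bit 4, address decrement in bit 5, mode in bits 6-7) are taken from the usual Am9517A layout rather than from this text, and the helper names are hypothetical. The resulting values agree with the constants loaded in the sample DMA program.

```python
# Field encodings assumed from the Am9517A mode register layout.
TRANSFER = {"verify": 0b00, "write": 0b01, "read": 0b10}
MODE = {"demand": 0b00, "single": 0b01, "block": 0b10, "cascade": 0b11}

# Command word: only bit 5 (extended write) is set; all other bits are 0.
COMMAND_WORD = 1 << 5


def mode_word(channel, transfer, auto_init, decrement, mode):
    """Pack one channel's mode register word."""
    return (
        (MODE[mode] << 6)
        | (int(decrement) << 5)
        | (int(auto_init) << 4)
        | (TRANSFER[transfer] << 2)
        | channel
    )


def mask_word(masked_channels):
    """Build the mask register pattern: 1 = channel masked, 0 = cleared."""
    word = 0
    for channel in masked_channels:
        word |= 1 << channel
    return word
```

With channels 0 and 1 masked, channels 2 and 3 are left cleared and therefore able to respond to requests.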

The following is a sequence of operations required for each 
Am9511A operation: 

1. The operand address is written to the base address register of
channel 2 of the Am9517.

2. If the word count of the operand is different from the previous
operation, the new word count is written to channel 2 of the
Am9517.

3. The address of the result is written to the channel 3 base
address register.

4. A software request is sent to channel 2.

5. The CPU performs other tasks.

6. An interrupt is received from channel 2 end of operation signal.

7. The CPU writes the command word into the command register
with MSB of the command word set to 1 to indicate DMA
service required at end of operation.

8. The CPU is free to perform other tasks.

9. An interrupt is received from channel 3 end of operation signal.
The result is now in the desired location in main memory.
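
Steps 1, 3 and 4 of this sequence reduce to a few byte-level values. The Python sketch below shows them under stated assumptions: a 16-bit address is written low byte first, then high byte, as the sample program does, and the request register takes the channel number in bits 0-1 with bit 2 set to raise a request. The function names are hypothetical.

```python
def address_bytes(address):
    """Split a 16-bit address into the (low, high) bytes written in turn."""
    return address & 0xFF, (address >> 8) & 0xFF


def software_request(channel, set_request=True):
    """Request register word: channel in bits 0-1, bit 2 sets the request."""
    return (0x04 if set_request else 0x00) | (channel & 0x03)
```

For channel 2 the request word is 00000110B, the value the sample program writes to the request register.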

The above method offers maximum concurrent operation of an
Am9080A and Am9511A system. If the Am9511 or Am9512 is used
instead of the Am9511A, the mode of transfer of the Am9517 must be
single transfer mode to obtain a transition at the chip select
input of the Am9511 or Am9512.
LOC   OBJ       SOURCE STATEMENT

                $PAGEWIDTH(80) MACROFILE NOOBJECT
                ;*********************************
                ;       PROGRAMS FOR CHAPTER 7 OF
                ;       FLOATING POINT TUTORIAL
                ;*********************************
                        NAME    CHAP7
                ;
                ;       AM9511A ARITHMETIC PROCESSING UNIT
                ;       I/O PORT ASSIGNMENT
                ;
00C0            APUDR   EQU     0C0H            ;AM9511A DATA PORT
00C1            APUSR   EQU     APUDR+1         ;AM9511A STATUS PORT
00C1            APUCR   EQU     APUSR           ;AM9511A COMMAND PORT
                ;
                ;       AM9517A MULTIMODE DMA CONTROLLER
                ;       I/O PORT ASSIGNMENT
                ;
00B0            DMAC    EQU     0B0H            ;AM9517A BASE ADDRESS
00B4            CH2ADR  EQU     DMAC+4          ;CHANNEL 2 ADDRESS
00B5            CH2CNT  EQU     DMAC+5          ;CHANNEL 2 BYTE COUNT
00B6            CH3ADR  EQU     DMAC+6          ;CHANNEL 3 ADDRESS
00B7            CH3CNT  EQU     DMAC+7          ;CHANNEL 3 BYTE COUNT
00B8            CMD17   EQU     DMAC+8          ;COMMAND REGISTER
00B9            REQ17   EQU     DMAC+9          ;REQUEST REGISTER
00BB            MOD17   EQU     DMAC+0BH        ;MODE REGISTER
00BD            CLR17   EQU     DMAC+0DH        ;MASTER CLEAR
00BF            MSK17   EQU     DMAC+0FH        ;MASK REGISTER
                ;
                ;       AM9519 UNIVERSAL INTERRUPT CONTROLLER
                ;       I/O PORT ASSIGNMENT
                ;
00C2            UICDR   EQU     0C2H            ;AM9519 DATA PORT
00C3            UICSR   EQU     UICDR+1         ;AM9519 STATUS PORT
00C3            UICCR   EQU     UICSR           ;AM9519 COMMAND PORT
                ;
                        CSEG
                ;
                ;       PROGRAM EXAMPLE FOR DEMAND WAIT INTERFACE
                ;       ***** FIGURE 7.1 *****
                ;
                ;       TO CALL THE FOLLOWING PROGRAM,
                ;       ON ENTRY:
                ;               HL = POINTER TO THE FIRST OPERAND (NOS)
                ;               DE = POINTER TO THE SECOND OPERAND (TOS)
                ;               BC = POINTER TO THE RESULT
                ;               A  = THE 2 OPERAND OPCODE
                ;       ON RETURN:
                ;               A  = THE STATUS REGISTER OF AM9511A
                ;               ALL POINTERS ARE DESTROYED
                ;
0000  C5        DEMAND: PUSH    B               ;SAVE RESULT POINTER
0001  F5                PUSH    PSW             ;SAVE OPCODE
0002  010300            LXI     B,3
0005  09                DAD     B               ;MOVE SOURCE POINTER TO LSB
                ;
                ;       PUSH OPERAND #1 ONTO APU DATA STACK
                ;
0006  0604              MVI     B,4             ;INIT LOOP1 COUNTER
0008  7E        DLOOP1: MOV     A,M             ;FETCH A BYTE FROM OPER 1
0009  D3C0              OUT     APUDR           ;PUSH ONTO APU DATA STACK
000B  2B                DCX     H               ;DEC. BYTE POINTER
000C  05                DCR     B               ;DEC. LOOP COUNTER
000D  C20800            JNZ     DLOOP1
                ;
0010  EB                XCHG                    ;PUT OPERAND 2 POINTER IN HL
0011  010300            LXI     B,3
0014  09                DAD     B               ;MOVE POINTER TO LSB
                ;
                ;       PUSH OPERAND #2 ONTO APU DATA STACK
                ;
0015  0604              MVI     B,4
0017  7E        DLOOP2: MOV     A,M             ;FETCH A BYTE FROM OPER 2
0018  D3C0              OUT     APUDR           ;PUSH ONTO APU DATA STACK
001A  2B                DCX     H               ;DEC. BYTE POINTER
001B  05                DCR     B               ;DEC. LOOP COUNTER
001C  C21700            JNZ     DLOOP2
                ;
                ;       OPERAND LOAD COMPLETE, WRITE COMMAND
                ;
001F  F1                POP     PSW             ;RETRIEVE COMMAND OPCODE
0020  D3C1              OUT     APUCR           ;WRITE TO APU COMMAND PORT
                ;
                ;       READ DATA FROM STACK
                ;       IF THE APU IS NOT READY, THE PAUSE
                ;       SIGNAL WILL PUT AM9080A INTO THE
                ;       "WAIT" STATE UNTIL THE DATA IS READY
                ;
0022  C1                POP     B               ;RETRIEVE RESULT POINTER
0023  1E04              MVI     E,4             ;INIT LOOP3 COUNTER
0025  DBC0      DLOOP3: IN      APUDR           ;READ APU STACK
0027  02                STAX    B               ;STORE RESULT IN MEMORY
0028  03                INX     B
0029  1D                DCR     E
002A  C22500            JNZ     DLOOP3
                ;
                ;       RETURN STATUS IN A
                ;
002D  DBC1              IN      APUSR
002F  C9                RET
                $       EJECT

Figure 7.1. Demand/Wait Programming
LOC   OBJ       SOURCE STATEMENT

                ;
                ;       SUBROUTINE FOR POLL STATUS INTERFACE
                ;       ***** FIGURE 7.2 *****
                ;
0030  C5        POLL:   PUSH    B               ;SAVE RESULT POINTER
0031  F5                PUSH    PSW             ;SAVE OPCODE
0032  010300            LXI     B,3
0035  09                DAD     B               ;MOVE POINTER TO LSB
                ;
                ;       CHECK IF AM9511A IS READY TO ACCEPT DATA
                ;
0036  DBC1      CHK1:   IN      APUSR           ;READ APU STATUS
0038  B7                ORA     A               ;SET CPU FLAGS
0039  FA3600            JM      CHK1            ;LOOP BACK IF NOT READY
                ;
                ;       THE AM9511A IS READY IF FALLEN THROUGH
                ;
003C  0604              MVI     B,4             ;INIT LOOP1 COUNTER
003E  7E        PLOOP1: MOV     A,M             ;FETCH FROM OPERAND 1
003F  D3C0              OUT     APUDR           ;PUSH ONTO APU DATA STACK
0041  2B                DCX     H               ;DEC. BYTE POINTER
0042  05                DCR     B               ;DEC. LOOP COUNTER
0043  C23E00            JNZ     PLOOP1
                ;
0046  EB                XCHG                    ;PUT OPERAND 2 POINTER IN HL
0047  010300            LXI     B,3
004A  09                DAD     B               ;MOVE POINTER TO LSB
                ;
                ;       PUSH OPERAND #2 ONTO APU DATA STACK
                ;
004B  0604              MVI     B,4             ;INIT LOOP2 COUNTER
004D  7E        PLOOP2: MOV     A,M             ;FETCH FROM OPERAND 2
004E  D3C0              OUT     APUDR           ;PUSH ONTO APU DATA STACK
0050  2B                DCX     H               ;DEC. BYTE POINTER
0051  05                DCR     B               ;DEC. LOOP COUNTER
0052  C24D00            JNZ     PLOOP2
                ;
                ;       OPERANDS LOADED, WRITE COMMAND
                ;
0055  F1                POP     PSW             ;RETRIEVE OPCODE
0056  D3C1              OUT     APUCR           ;WRITE COMMAND TO APU
                ;
                ;       SET UP RESULT POINTER AND LOOP3 COUNTER
                ;
0058  C1                POP     B               ;RETRIEVE RESULT POINTER
0059  1E04              MVI     E,4             ;INIT LOOP3 COUNTER
                ;
                ;       WAIT UNTIL AM9511A FINISH EXECUTION
                ;
005B  DBC1      CHK2:   IN      APUSR           ;READ APU STATUS PORT
005D  B7                ORA     A               ;SET STATUS FLAGS
005E  FA5B00            JM      CHK2            ;LOOP BACK IF NOT READY
0061  F5                PUSH    PSW             ;SAVE APU STATUS
                ;
                ;       THE AM9511A HAS FINISHED EXECUTION
                ;       READ RESULT
                ;
0062  DBC0      PLOOP3: IN      APUDR           ;READ APU DATA STACK
0064  02                STAX    B               ;STORE RESULT IN MEMORY
0065  03                INX     B               ;INC. MEMORY POINTER
0066  1D                DCR     E               ;DEC. LOOP COUNTER
0067  C26200            JNZ     PLOOP3
                ;
                ;       EXECUTION COMPLETE, RESTORE STATUS IN A
                ;
006A  F1                POP     PSW             ;RESTORE APU STATUS
006B  C9                RET
                $       EJECT

Figure 7.2. Status Poll Programming Interface
LOC   OBJ       SOURCE STATEMENT

                ;
                ;       SUBROUTINES FOR INTERRUPT DRIVEN INTERFACE
                ;       ***** FIGURE 7.3 *****
                ;
                ;       LOCATE INTERRUPT HANDLER IN RST 7 LOCATION
                ;
                        ASEG
0038                    ORG     38H
0038  F5        RST7:   PUSH    PSW             ;SAVE ALL REGISTERS USED
0039  C5                PUSH    B
003A  E5                PUSH    H
003B  0604              MVI     B,4             ;INIT LOOP COUNTER
003D  2A0000 D          LHLD    RSTPTR          ;FETCH RESULT POINTER
                ;
0040  DBC0      ILOOP1: IN      APUDR           ;READ RESULT FROM APU
0042  77                MOV     M,A             ;STORE IT IN MEMORY
0043  23                INX     H               ;BUMP MEMORY POINTER
0044  05                DCR     B               ;DEC. LOOP COUNTER
0045  C24000            JNZ     ILOOP1
                ;
                ;       DONE, SET DONE FLAG AND RESTORE REGISTERS
                ;
0048  3E01              MVI     A,1
004A  320200 D          STA     DONE
004D  E1                POP     H
004E  C1                POP     B
004F  F1                POP     PSW
0050  C9                RET
                ;
                ;       SUBROUTINE TO LOAD APU STACK AND SEND
                ;       COMMAND WORD
                ;
                ;       CALLING SEQUENCE:
                ;       ON ENTRY:  HL = POINTER TO MSB OF 8 BYTES
                ;                       OF OPERAND
                ;                  DE = POINTER TO 4 BYTES OF RESULT
                ;                  A  = EXECUTION OPCODE
                ;
                ;       ON RETURN: ALL REGISTERS ARE NOT AFFECTED,
                ;                  DONE FLAG CLEARED.
                ;
                        CSEG
                ;
006C  E5        LOAD:   PUSH    H               ;SAVE OPERAND POINTER
006D  D5                PUSH    D               ;SAVE RESULT POINTER
006E  F5                PUSH    PSW             ;SAVE OPCODE
006F  110800            LXI     D,8             ;OPER. OFFSET, E = LOOP2 CTR
0072  19                DAD     D               ;MOVE OPERAND POINTER TO LSB
                ;
                ;       CHECK AM9511A STATUS
                ;
0073  DBC1      LLOOP1: IN      APUSR           ;READ AM9511 STATUS REG.
0075  B7                ORA     A               ;TEST FOR BUSY
0076  FA7300 C          JM      LLOOP1          ;WAIT UNTIL NOT BUSY
                ;
                ;       LOAD AM9511 STACK
                ;
0079  2B        LLOOP2: DCX     H               ;DEC. OPERAND POINTER
007A  7E                MOV     A,M             ;FETCH 1 BYTE OF OPERAND
007B  D3C0              OUT     APUDR           ;LOAD APU DATA STACK
007D  1D                DCR     E               ;DEC. LOOP COUNTER
007E  C27900 C          JNZ     LLOOP2
                ;
0081  F1                POP     PSW             ;GET OPCODE
0082  D3C1              OUT     APUCR           ;WRITE TO APU COMMAND REG.
0084  210200 D          LXI     H,DONE
0087  3600              MVI     M,0             ;CLEAR DONE FLAG
0089  E1                POP     H               ;GET RESULT POINTER
008A  220000 D          SHLD    RSTPTR          ;STORE IN RESULT POINTER
008D  EB                XCHG                    ;RESTORE DE REG. PAIR
008E  E1                POP     H               ;RESTORE HL
008F  C9                RET
                ;
                ;       RAM AREA
                ;
                        DSEG
                ;
0000            RSTPTR: DS      2               ;RESULT POINTER
0002            DONE:   DS      1               ;DONE FLAG, 1 = DONE
                $       EJECT

Figure 7.3. Interrupt Driven Programming
ISIS-II 8080/8085 MACRO ASSEMBLER, V3.0        CHAP7

LOC   OBJ       SOURCE STATEMENT

                ;
                ;       HIGH PERFORMANCE INTERFACE WITH
                ;       AM9517A AND AM9519
                ;       **** FIGURE 7.4 ****
                ;
                        CSEG
                ;
                ;       AM9517A INITIALIZATION ROUTINE
                ;       CALLING SEQUENCE:
                ;       NO PARAMETERS REQUIRED ON ENTRY.
                ;       SOURCE OPERANDS ASSUMED TO BE 8 BYTES AND
                ;       RESULT OPERAND ASSUMED TO BE 4 BYTES
                ;
                ;       ON RETURN: NO REGISTER AFFECTED
                ;
0090  F5        INIT17: PUSH    PSW             ;SAVE PSW
0091  D3BD              OUT     CLR17           ;MASTER CLEAR
0093  3E20              MVI     A,00100000B     ;LOAD COMMAND WORD
0095  D3B8              OUT     CMD17           ;WRITE TO COMMAND REG.
0097  3EBA              MVI     A,10111010B     ;LOAD CH 2 MODE WORD
0099  D3BB              OUT     MOD17           ;INIT CHANNEL 2 MODE
009B  3E97              MVI     A,10010111B     ;LOAD CH 3 MODE WORD
009D  D3BB              OUT     MOD17           ;INIT CHANNEL 3 MODE
009F  3E08              MVI     A,8             ;LOAD CH 2 BYTE COUNT
00A1  D3B5              OUT     CH2CNT          ;INIT CH 2 LOW BYTE COUNT
00A3  AF                XRA     A
00A4  D3B5              OUT     CH2CNT          ;INIT CH 2 HIGH BYTE COUNT
00A6  3E04              MVI     A,4             ;LOAD CH 3 BYTE COUNT
00A8  D3B7              OUT     CH3CNT          ;INIT CH 3 LOW BYTE COUNT
00AA  AF                XRA     A
00AB  D3B7              OUT     CH3CNT          ;INIT CH 3 HIGH BYTE COUNT
00AD  3E03              MVI     A,00000011B     ;LOAD MASK REGISTER PATTERN
00AF  D3BF              OUT     MSK17           ;INIT MASK REGISTER
00B1  F1                POP     PSW             ;RESTORE PSW
00B2  C9                RET
                ;
                ;       SUBROUTINE TO INITIALIZE AM9519
                ;       CALLING SEQUENCE:
                ;       ON ENTRY:  HL = STARTING ADDRESS OF WRITE
                ;                       COMMAND SUBROUTINE
                ;                  DE = STARTING ADDRESS OF SET
                ;                       DONE FLAG SUBROUTINE
                ;       ON RETURN: NO REGISTERS ARE AFFECTED
                ;
00B3  F3        INIT19: DI                      ;DISABLE ALL CPU INTERRUPTS
00B4  F5                PUSH    PSW             ;SAVE PSW
00B5  AF                XRA     A
00B6  D3C3              OUT     UICCR           ;SOFTWARE RESET AM9519
00B8  3E88              MVI     A,10001000B     ;MODE WORD FOR M0-M4
00BA  D3C3              OUT     UICCR           ;SET M0-M4
00BC  3EC0              MVI     A,11000000B     ;SELECT AUTO CLEAR REG
00BE  D3C3              OUT     UICCR
00C0  3E03              MVI     A,00000011B     ;SELECT CH 0, 1 FOR AUTO CLR
00C2  D3C2              OUT     UICDR
00C4  3EB0              MVI     A,10110000B     ;SELECT MASK REGISTER
00C6  D3C3              OUT     UICCR
00C8  3EFC              MVI     A,11111100B     ;CLR CH 0, 1 MASK REG.
00CA  D3C2              OUT     UICDR
00CC  3EF0              MVI     A,11110000B     ;SEL CH 0 FOR 3 BYTES
00CE  D3C3              OUT     UICCR
00D0  3ECD              MVI     A,0CDH          ;9080A 'CALL' OPCODE
00D2  D3C2              OUT     UICDR
00D4  7B                MOV     A,E             ;GET CH 0 LOW ADDRESS
00D5  D3C2              OUT     UICDR
00D7  7A                MOV     A,D             ;GET CH 0 HIGH ADDRESS
00D8  D3C2              OUT     UICDR
00DA  3EF1              MVI     A,11110001B     ;SEL CH 1 FOR 3 BYTES
00DC  D3C3              OUT     UICCR
00DE  3ECD              MVI     A,0CDH          ;9080A 'CALL' OPCODE
00E0  D3C2              OUT     UICDR
00E2  7D                MOV     A,L             ;GET CH 1 LOW ADDRESS
00E3  D3C2              OUT     UICDR
00E5  7C                MOV     A,H             ;GET CH 1 HIGH ADDRESS
00E6  D3C2              OUT     UICDR
00E8  3EA1              MVI     A,10100001B     ;ARM AM9519
00EA  D3C3              OUT     UICCR
00EC  F1                POP     PSW             ;RESTORE PSW
00ED  FB                EI                      ;ENABLE CPU INTERRUPTS
00EE  C9                RET
                ;
                ;       SUBROUTINE TO PERFORM AN EXECUTION WITH
                ;       8 BYTES OF OPERANDS AND 4 BYTES OF RESULT
                ;       CALLING SEQUENCE:
                ;       ENTRY:     HL = ADDRESS OF OPERANDS
                ;                  DE = ADDRESS OF RESULT
                ;                  A  = OPCODE
                ;       ON RETURN: ALL REGISTERS ARE NOT AFFECTED
                ;
00EF  F5        EXEC:   PUSH    PSW             ;SAVE OPCODE
00F0  320300 D          STA     OPCODE          ;INIT OPCODE STORAGE
00F3  AF                XRA     A
00F4  320400 D          STA     DONE2           ;CLEAR DONE FLAG
00F7  7D                MOV     A,L
00F8  D3B4              OUT     CH2ADR          ;INIT CH 2 LOW ADDR
00FA  7C                MOV     A,H
00FB  D3B4              OUT     CH2ADR          ;INIT CH 2 HIGH ADDR
00FD  7B                MOV     A,E
00FE  D3B6              OUT     CH3ADR          ;INIT CH 3 LOW ADDR
0100  7A                MOV     A,D
0101  D3B6              OUT     CH3ADR          ;INIT CH 3 HIGH ADDR
0103  3E06              MVI     A,00000110B
0105  D3B9              OUT     REQ17           ;SOFTWARE REQ TO CH 2
0107  F1                POP     PSW             ;RESTORE PSW
0108  C9                RET
                ;
                ;       INTERRUPT HANDLER #1 TO WRITE COMMAND WORD
                ;       TO AM9511A WHEN AM9517A HAS FINISHED
                ;       LOADING THE OPERANDS
                ;
0109  F5        INTR1:  PUSH    PSW             ;SAVE PSW
010A  3A0300 D          LDA     OPCODE          ;GET OPCODE
010D  D3C1              OUT     APUCR           ;WRITE TO COMMAND REGISTER
010F  F1                POP     PSW             ;RESTORE PSW
0110  FB                EI                      ;RE-ENABLE CPU INTERRUPTS
0111  C9                RET
                ;
                ;       INTERRUPT HANDLER #2 TO SET DONE FLAG
                ;       TO INDICATE OPERATION IS COMPLETE
                ;
0112  F5        INTR2:  PUSH    PSW             ;SAVE PSW
0113  3E01              MVI     A,1
0115  320400 D          STA     DONE2           ;SET DONE FLAG
0118  F1                POP     PSW             ;RESTORE PSW
0119  FB                EI                      ;RE-ENABLE CPU INTERRUPTS
011A  C9                RET
                ;
                ;       RAM AREA
                ;
                        DSEG
                ;
0003            OPCODE: DS      1               ;APU OPCODE SAVE AREA
0004            DONE2:  DS      1               ;DONE FLAG
                ;
                        END

PUBLIC SYMBOLS

EXTERNAL SYMBOLS

USER SYMBOLS
APUCR  A 00C1    APUDR  A 00C0    APUSR  A 00C1    CH2ADR A 00B4
CH2CNT A 00B5    CH3ADR A 00B6    CH3CNT A 00B7    CHK1   C 0036
CHK2   C 005B    CLR17  A 00BD    CMD17  A 00B8    DEMAND C 0000
DLOOP1 C 0008    DLOOP2 C 0017    DLOOP3 C 0025    DMAC   A 00B0
DONE   D 0002    DONE2  D 0004    EXEC   C 00EF    ILOOP1 A 0040
INIT17 C 0090    INIT19 C 00B3    INTR1  C 0109    INTR2  C 0112
LLOOP1 C 0073    LLOOP2 C 0079    LOAD   C 006C    MOD17  A 00BB
MSK17  A 00BF    OPCODE D 0003    PLOOP1 C 003E    PLOOP2 C 004D
PLOOP3 C 0062    POLL   C 0030    REQ17  A 00B9    RST7   A 0038
RSTPTR D 0000    UICCR  A 00C3    UICDR  A 00C2    UICSR  A 00C3

ASSEMBLY COMPLETE, NO ERRORS

Figure 7.4. DMA Interface Programming
CHAPTER 8
FLOATING POINT EXECUTION TIMES 



8.1 INTRODUCTION 

This chapter offers some numerical values comparing the execution
times of the Am9511A and Am9512 with those of their software
counterparts. The software packages selected are the Intel
FPAL.LIB(1) floating point library and the Lawrence Livermore
Laboratory BASIC (LLL BASIC). These two software packages
were selected because the Intel format is the same as the Am9512
single precision format and the LLL BASIC format is the same as
the Am9511A floating point format. This should offer a reasonably
comprehensive comparison.

In the execution-time cycles tables, the cycles given for the 
Am9511A and Am9512 are from the issue of the command to the 
completion of the command execution. The times for loading and 
unloading the operands are not included because these times 
depend on external hardware and also depend on whether the 
calculation is a chain calculation. Similarly, the software cycles 
are counted from the "Call" instruction to the "Ret" instruction of 
the floating point package. Operand setup time is also not 
counted. 

The measurement was conducted on an Intel MDS 800(2) system
with an Advanced Micro Computers 95/6011 APU board and
95/6012 FPU board. The host is a 2-MHz 8080A. The clock for the
95/6011 or 95/6012 board is derived from the 9.8304-MHz bus
clock divided by five to achieve a frequency of 1.96608 MHz.
Because the main memory of the MDS 800 is dynamic, there is
approximately ±0.5% uncertainty in the software timing
measurements. Because the bus clock is asynchronous to the CPU clock
and the internal clock of the Am9511A and Am9512 is a two-phase
clock derived from the single-phase bus clock, there is a ±2-clock
uncertainty in the hardware measurements.
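
The clock figures above make it easy to convert the cycle counts in the following tables into time. The short Python sketch below does the arithmetic; the constant and function names are illustrative only.

```python
BUS_CLOCK_HZ = 9_830_400          # 9.8304-MHz bus clock
APU_CLOCK_HZ = BUS_CLOCK_HZ / 5   # divided by five: 1.96608 MHz


def cycles_to_microseconds(cycles, clock_hz=APU_CLOCK_HZ):
    """Convert a cycle count at the given clock rate into microseconds."""
    return cycles * 1e6 / clock_hz
```

At this clock rate, 200 cycles take roughly 101.7 microseconds, and the ±2-clock measurement uncertainty corresponds to about ±1 microsecond.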

8.2 FLOATING POINT ADD/SUBTRACT 
EXECUTION TIMES 

Floating point add and subtract usually share the same routine.
Floating point subtract is merely a change of sign of the
subtrahend and is performed as floating point add. For the sake of
discussion in this chapter, we assume the two operands are of
like signs. If the operands are of different signs, the discussion
about addition will apply to subtraction and vice versa.

The execution time of floating point addition is mostly dependent
on the exponent alignment time of the two operands; a maximum of
one shift is required for post-normalization. If the addend
and the augend have the same exponent, no exponent alignment
time is required. If the magnitudes of the addend and the augend
are fairly close, only a few alignment shifts are required. If the
addend and augend are very different, the number of required
shifts is large, hence the longer execution time.

The execution time of floating point subtraction not only has the
same exponent alignment time as floating point addition, it
also has a post-normalization time. Like floating point addition,
the execution time lengthens as the magnitude of the minuend
diverges from the magnitude of the subtrahend. Unlike the floating
point add routine, the execution time also lengthens as the
subtrahend approaches the value of the minuend. This is due to
the number of left shifts required to produce a normalized result.
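
The alignment cost can be estimated directly from the operands. The Python sketch below assumes the Am9511A word layout implied by the hexadecimal operands in Table 8.1 (sign in bit 31, a 7-bit two's complement exponent in bits 30-24, a 24-bit mantissa below it); the function names are hypothetical.

```python
def am9511_exponent(word):
    """Extract the 7-bit two's complement exponent from a 32-bit word."""
    field = (word >> 24) & 0x7F
    return field - 128 if field & 0x40 else field


def alignment_shifts(a, b):
    """Number of mantissa shifts needed to align two operands."""
    return abs(am9511_exponent(a) - am9511_exponent(b))
```

For the first row of Table 8.1 (5 and .0006) this gives 13 shifts, against 0 shifts for 5 and 6, which is consistent with the longest and shortest add times among the rows that use 5 as the first operand.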

Table 8.1 shows the cycle times of the Am9511A and LLL BASIC
floating point add and subtract routines. Table 8.2 shows the
cycle times of the Am9512 and the Intel floating point library.
The software execution times given have been normalized
for a 2-MHz 8080A.

8.3 FLOATING POINT MULTIPLY/DIVIDE 
EXECUTION TIMES 

Unlike floating point add or subtract, the execution times of
floating point multiply or divide fall within a relatively narrow range
and are not dependent on the relative magnitudes of the operands.
Most multiplication algorithms use a shift-and-add method. For
such algorithms, the execution time depends mainly on the
number of 1's in the multiplier. The number of 1's in the
multiplicand does not affect the execution time. The division execution
time dependency is more complicated because of the number of
division algorithms in use. In general, there is no simple way to
predict the division execution time of a particular pair of operands
(Tables 8.3 and 8.4).
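
The dependence on the number of 1's in the multiplier can be seen in a generic shift-and-add multiply. The Python sketch below is a textbook version of the method operating on unsigned integers; it is not claimed to be the algorithm used inside the Am9511A, the Am9512 or either software package.

```python
def shift_add_multiply(multiplicand, multiplier):
    """Multiply two unsigned integers, counting the additions performed."""
    product = 0
    additions = 0
    while multiplier:
        if multiplier & 1:
            # An addition is needed only where the multiplier has a 1 bit.
            product += multiplicand
            additions += 1
        multiplicand <<= 1
        multiplier >>= 1
    return product, additions
```

The addition count equals the number of 1 bits in the multiplier and is unaffected by the bit pattern of the multiplicand.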

8.4 DOUBLE-PRECISION FLOATING POINT 
EXECUTION TIMES 

The Am9512 supports a double-precision (64-bit) floating point
format. No known 64-bit floating point library routines are
available at this time. Some sample execution times are given. The
operands are selected over a representative range to give a
comprehensive average (Tables 8.5 and 8.6).
TABLE 8.1. Am9511A vs LLL BASIC FLOATING POINT ADD/SUBTRACT EXECUTION TIME COMPARISON

    OPERAND #1            OPERAND #2            AM9511        LLL BASIC
DEC.      HEX.        DEC.      HEX.         FADD   FSUB     FADD    FSUB

5         03A00000    .0006     769D4951      214    228     3395    3884
5         03A00000    .006      79C49BA4      179    192     3000    3506
5         03A00000    .06       7CF5C28E      143    156     2608    3088
5         03A00000    .6        00999999       95    108     2100    2578
5         03A00000    6         03C00000       57     91     1826    2105
5         03A00000    60        06F00000      116    120     2362    2281
5         03A00000    600       0A960000      153    169     2540    2805
5         03A00000    6000      0DBB8000      189    204     2945    3186
123       07F60000    456       09E40000      103    108     2215    2137
.123      7DFBE76C    456       09E40000      213    227     3220    3467
123       07F60000    .456      7FE978D4      154    169     2748    3241
12345     0EC0E400    67890     11849900      106    131     2030    2460
1.3579    01ADCFAA    24680     0FC0D000      238    253     3469    3727
.000012   70C9539A    340000    13A60400      344    347     4783    5025
234       08EA0000    -678      8AA98000      118     96     2605    1920
-1.234    819DF3B6    12345     0EC0E400      238    229     3890    3367

TOTAL                                        2660   2828    45736   48777
AVERAGE                                     166.2  176.8   2858.5  3048.6
TABLE 8.2. Am9512 vs INTEL FPAL.LIB FLOATING POINT ADD/SUBTRACT EXECUTION TIME COMPARISON

    OPERAND #1            OPERAND #2            AM9512         FPAL.LIB
DEC.      HEX.        DEC.      HEX.         SADD   SSUB     FADD    FSUB

5         40A00000    .0006     3A1D4952      254    275     2351    2568
5         40A00000    .006      3BC49BA6      229    217     1914    2152
5         40A00000    .06       3D75C28F      171    178     2506    2724
5         40A00000    .6        3F19999A       98    119     1954    2178
5         40A00000    6         40C00000       58     89     1430    1734
5         40A00000    60        42700000      128    123     2002    2165
5         40A00000    600       44160000      169    177     2455    2712
5         40A00000    6000      45BB8000      212    219     1866    2159
123       42F60000    456       43E40000      114    109     1844    2036
.123      3DFBE76D    456       43E40000      264    283     2145    2424
123       42F60000    .456      3EE978D4      192    183     1651    1878
12345     4640E400    67890     47849900      114    140     1889    2279
1.3579    3FADCFAB    24680     46C0D000      300    309     2435    2715
.000012   3749539B    340000    48A60400      475    477     1953    2231
234       436A0000    -678      C4298000      124    101     2155    1911
-1.234    BF9DF3B6    12345     4640E400      284    297     2564    2284

TOTAL                                        3186   3296    33114   36150
AVERAGE                                     199.1  206.0   2069.6  2259.4
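
The hexadecimal operand columns for the Am9512 can be cross-checked against their decimal values. The Python sketch below assumes that the Am9512 single precision word has the same bit layout as an IEEE 754 32-bit float (sign, 8-bit excess-127 exponent, 23-bit fraction with an implied leading 1), which the tabulated values suggest; the function name is hypothetical.

```python
import struct


def am9512_single_to_float(hex_word):
    """Decode an 8-digit hex word, assuming an IEEE 754 binary32 layout."""
    return struct.unpack(">f", bytes.fromhex(hex_word))[0]
```

For example, `am9512_single_to_float("40A00000")` decodes to 5.0, matching the first operand used in most rows of the table.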
TABLE 8.3. Am9511A vs LLL BASIC FLOATING POINT MULTIPLY/DIVIDE EXECUTION TIME COMPARISON

    OPERAND #1            OPERAND #2            AM9511        LLL BASIC
DEC.      HEX.        DEC.      HEX.         FMUL   FDIV     FMUL    FDIV

5         03A00000    .0006     769D4951      174    157     8451   13013
5         03A00000    .006      79C49BA4      174    178     8441   12856
5         03A00000    .06       7CF5C28E      149    177     8264   12867
5         03A00000    .6        00999999      174    157     8407   13302
5         03A00000    6         03C00000      173    178     8423   12835
5         03A00000    60        06F00000      148    179     8218   12892
5         03A00000    600       0A960000      173    155     8415   12214
5         03A00000    6000      0DBB8000      175    179     8437   13020
123       07F60000    456       09E40000      148    156     8939   12713
.123      7DFBE76C    456       09E40000      148    157    10948   13373
123       07F60000    .456      7FE978D4      149    155     8965   12878
12345     0EC0E400    67890     11849900      173    157     9163   14305
1.3579    01ADCFAA    24680     0FC0D000      147    179    10591   13149
.000012   70C9539A    340000    13A60400      149    157    10018   13395
234       08EA0000    -678      8AA98000      148    156     8781   13509
-1.234    819DF3B6    12345     0EC0E400      175    178    10971   12952

TOTAL                                        2577   2655   145432  209273
AVERAGE                                     161.1  165.9   9089.5 13079.6
TABLE 8.4. Am9512 vs INTEL FPAL.LIB FLOATING POINT MULTIPLY/DIVIDE EXECUTION TIME COMPARISON

    OPERAND #1            OPERAND #2            AM9512         FPAL.LIB
DEC.      HEX.        DEC.      HEX.         SMUL   SDIV     FMUL    FDIV

5         40A00000    .0006     3A1D4952      234    250     3206    7757
5         40A00000    .006      3BC49BA6      256    235     3252    7905
5         40A00000    .06       3D75C28F      198    247     3088    7975
5         40A00000    .6        3F19999A      234    248     3245    7708
5         40A00000    6         40C00000      220    232     3052    7955
5         40A00000    60        42700000      200    246     2897    7999
5         40A00000    600       44160000      220    248     3072    7799
5         40A00000    6000      45BB8000      220    246     3137    7853
123       42F60000    456       43E40000      201    248     2903    7820
.123      3DFBE76D    456       43E40000      199    243     3087    7834
123       42F60000    .456      3EE978D4      219    236     3072    7822
12345     4640E400    67890     47849900      242    249     3124    7585
1.3579    3FADCFAB    24680     46C0D000      253    240     3139    7854
.000012   3749539B    340000    48A60400      219    228     3131    7776
234       436A0000    -678      C4298000      201    234     2925    7721
-1.234    BF9DF3B6    12345     4640E400      223    227     3314    7852

TOTAL                                        3539   3857    49644  125215
AVERAGE                                     221.2  241.1   3102.8  7825.9
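
The AVERAGE rows of Tables 8.1 through 8.4 can be reduced to a single figure of merit per operation. The Python sketch below divides the software average by the hardware average; the dictionary layout and names are illustrative. The ratios are only approximate speed-up factors, since the hardware cycles are counted at 1.96608 MHz, the software cycles are normalized to a 2-MHz 8080A, and operand load and unload times are excluded from both.

```python
# (hardware average cycles, software average cycles) from Tables 8.1 to 8.4
AVERAGE_CYCLES = {
    "Am9511A vs LLL BASIC": {
        "add": (166.2, 2858.5),
        "subtract": (176.8, 3048.6),
        "multiply": (161.1, 9089.5),
        "divide": (165.9, 13079.6),
    },
    "Am9512 vs FPAL.LIB": {
        "add": (199.1, 2069.6),
        "subtract": (206.0, 2259.4),
        "multiply": (221.2, 3102.8),
        "divide": (241.1, 7825.9),
    },
}


def speedup(hardware_cycles, software_cycles):
    """Ratio of software cycles to hardware cycles."""
    return software_cycles / hardware_cycles


def speedup_table():
    """Speed-up ratio for every comparison and operation, rounded to 0.1."""
    return {
        pair: {op: round(speedup(hw, sw), 1) for op, (hw, sw) in ops.items()}
        for pair, ops in AVERAGE_CYCLES.items()
    }
```

On these averages the advantage of the hardware is smallest for add and subtract and largest for divide.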
TABLE 8.5. Am9512 DOUBLE PRECISION ADD/SUBTRACT EXECUTION TIMES

         OPERAND #1                     OPERAND #2                  AM9512
DEC.      HEX.                  DEC.      HEX.                   DADD    DSUB

5         4014000000000000      .0006     3F43A92A30553261       1273    1310
5         4014000000000000      .006      3F789374BC6A7EF9       1174    1211
5         4014000000000000      .06       3FAEB851EB851EB8       1038    1105
5         4014000000000000      .6        3FE3333333333333        868     891
5         4014000000000000      6         4018000000000000        720     773
5         4014000000000000      60        404E000000000000        951     922
5         4014000000000000      600       4082C00000000000       1091    1107
5         4014000000000000      6000      40B7700000000000       1229    1244
123       405EC00000000000      456       407C800000000000        906     877
.123      3FBF7CED916872B0      456       407C800000000000       1233    1280
123       405EC00000000000      .456      3FDD2F1A9FBE76C8       1072    1103
12345     40C81C8000000000      67890     40F0932000000000        907     960
1.3579    3FF5B9F559B3D07C      24680     40D81A0000000000       1322    1352
.000012   3EE92A737110E453      340000    4114C08000000000       2158    2232
234       406D400000000000      -678      C085300000000000        914     861
-1.234    BFF3BE76C8B43958      12345     40C81C8000000000       1309    1290

TOTAL                                                           18165   18518
AVERAGE                                                        1135.3  1157.4
TABLE 8.6. Am9512 DOUBLE PRECISION MULTIPLY/DIVIDE EXECUTION TIMES

OPERAND #1                    OPERAND #2                   Am9512
DEC.      HEX.                DEC.     HEX.                  DMUL    DDIV

5         4014000000000000    .0006    3F43A92A30553261      1810    4857
5         4014000000000000    .006     3F789374BC6A7EF9      1814    4983
5         4014000000000000    .06      3FAEB851EB851EB8      1779    5048
5         4014000000000000    .6       3FE3333333333333      1841    5007
5         4014000000000000    6        4018000000000000      1785    4700
5         4014000000000000    60       404E000000000000      1751    4699
5         4014000000000000    600      4082C00000000000      1787    4618
5         4014000000000000    6000     40B7700000000000      1786    4702
123       405EC00000000000    456      407C800000000000      1750    4671
.123      3FBF7CED916872B0    456      407C800000000000      1756    4748
123       405EC00000000000    .456     3FDD2F1A9FBE76C8      1744    4936
12345     40C81C8000000000    67890    40F0932000000000      1807    4696
1.3579    3FF5B9F559B3D07C    24680    40D81A0000000000      1762    4788
.000012   3EE92A737110E453    340000   4114C08000000000      1755    4764
234       406D400000000000    -678     C085300000000000      1750    4670
-1.234    BFF3BE76C8B43958    12345    40C81C8000000000      1802    4768

TOTAL                                                       28479   76655
AVERAGE                                                    1779.9  4790.9
CHAPTER 9
TRANSCENDENTAL FUNCTIONS OF Am9511A 



9.1 INTRODUCTION 

A "transcendental" function is defined as "a function that cannot
be expressed by a finite number of algebraic operations." Three
examples of such functions are the sine, logarithm and exponential
functions. The Am9511A performs a number of such functions, and this
chapter describes the algorithms adopted by the device.

9.2 CHEBYSHEV POLYNOMIALS 

Computer approximations of transcendental functions are often
based on some form of polynomial equation, such as

$$f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + \cdots$$

The best-known polynomial for evaluating transcendental
functions is the Taylor series

$$f(x) = f(a) + \sum_{k=1}^{\infty} \frac{f^{(k)}(a)}{k!}\,(x - a)^k$$

where $f^{(k)}(a)$ is the $k$th derivative of the function $f$
evaluated at $a$. The Taylor series usually works well when $(x - a)$
is a small number. When the value of $(x - a)$ is large, the number of
Taylor series terms required to evaluate to a given accuracy becomes
large. To avoid this shortcoming, there is a set of approximating
functions that not only minimizes the maximum error but also provides
an even distribution of errors within the selected data representation
interval. These are known as Chebyshev polynomials and are based upon
the cosine function. The Chebyshev polynomials $T_n(x)$ are defined
as follows

$$T_n(x) = \cos(n \cos^{-1} x)$$

The various terms of the Chebyshev series can be computed as

$$T_0(x) = \cos(0) = 1$$
$$T_1(x) = \cos(\cos^{-1} x) = x$$
$$T_2(x) = \cos(2\cos^{-1} x) = 2\cos^2(\cos^{-1} x) - 1 = 2x^2 - 1$$

and in general

$$T_n(x) = 2x\,T_{n-1}(x) - T_{n-2}(x) \quad \text{for } n \ge 2$$

The terms $T_3(x)$, $T_4(x)$, $T_5(x)$ and $T_6(x)$ are given below
for reference:

$$T_3(x) = 4x^3 - 3x$$
$$T_4(x) = 8x^4 - 8x^2 + 1$$
$$T_5(x) = 16x^5 - 20x^3 + 5x$$
$$T_6(x) = 32x^6 - 48x^4 + 18x^2 - 1$$
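The recurrence can be checked mechanically. The following is a minimal Python sketch (the function names `chebyshev_coefficients` and `evaluate` are illustrative and do not come from the Am9511A documentation) that builds the power-series coefficients of each polynomial from the recurrence and compares one value against the cosine definition.

```python
import math

def chebyshev_coefficients(n_max):
    """Return coefficient lists for T_0 .. T_n_max.

    polys[n][k] is the coefficient of x**k in T_n(x).
    """
    polys = [[1], [0, 1]]                      # T_0 = 1, T_1 = x
    for n in range(2, n_max + 1):
        prev, prev2 = polys[n - 1], polys[n - 2]
        cur = [0] * (n + 1)
        for k, a in enumerate(prev):           # add 2x * T_{n-1}
            cur[k + 1] += 2 * a
        for k, a in enumerate(prev2):          # subtract T_{n-2}
            cur[k] -= a
        polys.append(cur)
    return polys

def evaluate(poly, x):
    """Horner evaluation of a coefficient list (lowest power first)."""
    result = 0.0
    for a in reversed(poly):
        result = result * x + a
    return result

polys = chebyshev_coefficients(6)
for n in range(3, 7):
    print(n, polys[n])

# The recurrence and the cosine definition agree on [-1, 1].
print(evaluate(polys[5], 0.3), math.cos(5 * math.acos(0.3)))
```

The printed coefficient lists correspond to the four polynomials tabulated above, read from the constant term upward.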

It is not the intent of this book to go into the detailed derivation of
the Chebyshev series. For readers interested in the formal
derivation, references 1 and 3 are recommended. The Chebyshev
series is given as follows:

$$f(x) = \tfrac{1}{2} C_0 + \sum_{n=1}^{\infty} C_n T_n(x)$$

where

$$C_n = \frac{2}{\pi} \int_{-1}^{1} \frac{f(x)\,T_n(x)}{\sqrt{1 - x^2}}\,dx$$

For a given accuracy, only a finite number of terms is required.
The Am9511A selects the number of terms required by different
functions to provide a mean relative error of about one part in $10^7$.
The coefficients $C_n$ are all precalculated and stored in the
constant ROM.



Each of the transcendental functions in the Am9511A uses the
Chebyshev polynomial series except the square root function.
Each function is a three-step process as follows:

Range Reduction -

The input argument of the function is transformed to fall within a
range of values for which the function can be computed to a
valid result. For example, since functions like sine and cosine
are periodic in multiples of 2π radians, input arguments for these
functions are converted to lie within a range of 0 to π or
-π/2 to +π/2.

Chebyshev polynomial evaluation -

This step is the same for all functions. The algebraic sum of
the appropriate number of terms of the Chebyshev series is
computed.

Postprocessing -

Some functions, such as sine and cosine, need postprocessing
of the result, such as sign correction.

The following sections give a detailed function-by-function
description of each transcendental function in the Am9511A.

9.3 THE FUNCTIONS CHEBY AND ENTIER 

Two functions are used in the following sections. The first one is
CHEBY. This function evaluates the Chebyshev polynomial
series

$$f(x) = \tfrac{1}{2} C_0 + \sum_{k=1}^{n-1} C_k T_k(x)$$

The function is called by CHEBY (x, c, n) where x is the input
argument after any necessary preprocessing; c is the coefficient
list for the given function; and n is the number of Chebyshev
polynomial terms used.

The FORTRAN program to implement the CHEBY function is as
follows. Because FORTRAN arrays start at index 1, C(1) holds
the coefficient $C_0$ and T(1) holds $T_0(x)$:

      FUNCTION CHEBY (X, C, N)
      DIMENSION C(12), T(12)
      T(1) = 1
      T(2) = X
      CHEBY = 0.5 * C(1) + C(2) * T(2)
      DO 100 I = 3, N
      T(I) = 2 * X * T(I - 1) - T(I - 2)
  100 CHEBY = CHEBY + C(I) * T(I)
      END

This program is written for clarity rather than to minimize
execution time or code space. A program that improves execution speed
but is somewhat more obscure is as follows:

      FUNCTION CHEBY (X, C, N)
      DIMENSION C(12), T(12)
      B = 0
      D = C(N)
      X2 = 2 * X
      DO 100 I = N, 2, -1
      A = B
      B = D
  100 D = X2 * B - A + C(I - 1)
      CHEBY = (D - A)/2
      END
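As a cross-check, the two routines can be transcribed into Python and compared. This is only a sketch under stated assumptions: the names `cheby_direct` and `cheby_fast` are illustrative, and the coefficient list is zero-based, so `c[0]` plays the role of C(1) in the FORTRAN listings.

```python
def cheby_direct(x, c):
    """c[0]/2 + sum of c[k] * T_k(x), building each T_k explicitly."""
    t_prev, t_cur = 1.0, x                     # T_0 and T_1
    total = 0.5 * c[0]
    if len(c) > 1:
        total += c[1] * t_cur
    for ck in c[2:]:
        t_prev, t_cur = t_cur, 2.0 * x * t_cur - t_prev
        total += ck * t_cur
    return total

def cheby_fast(x, c):
    """Same sum, using the backward recurrence of the faster routine."""
    a = b = 0.0
    d = c[-1]
    for ck in reversed(c[:-1]):
        a, b = b, d
        d = 2.0 * x * b - a + ck
    return (d - a) / 2.0

coefficients = [1.0, 2.0, 3.0]
# With x = 0.5: T_1 = 0.5 and T_2 = -0.5, so the sum is 0.5 + 1.0 - 1.5.
print(cheby_direct(0.5, coefficients), cheby_fast(0.5, coefficients))
```

The backward form needs no array of polynomial values and performs one multiplication per coefficient, which is presumably why it is the faster of the two.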
The second function is called ENTIER. Entier is the French word
for integer. The entier function is similar to the FORTRAN integer
function, except that the integer function truncates toward zero
whereas the entier function rounds down to the nearest integer of
a lower or equal value. In other words, if the number is greater
than or equal to zero, both functions are identical. If the
number is negative and not an integer, such as -2.5, the results
differ: INT (-2.5) = -2, ENTIER (-2.5) = -3.

A FORTRAN program to implement the entier function is as
follows:

      FUNCTION ENTIER (X)
      ENTIER = INT (X)
      IF (ENTIER.GT.X) ENTIER = ENTIER - 1
      END
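In Python the same distinction exists between `int()`, which truncates toward zero, and `math.floor()`, which rounds toward minus infinity. A minimal sketch (the name `entier` is simply borrowed from the text):

```python
import math

def entier(x):
    """Largest integer not greater than x."""
    return math.floor(x)

for value in (2.5, -2.5, -2.0):
    # int() truncates toward zero; entier() rounds down.
    print(value, int(value), entier(value))
```

Only negative non-integer arguments separate the two functions; a negative whole number such as -2.0 is returned unchanged by both.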

9.4 SINE 

Any argument of the sine function can be reduced to a value from
-π/2 to +π/2. Hence the range reduction is

      X = X * 2/π
      X = X - 4 * ENTIER ((X + 1)/4)
      IF (X.GT.1) X = 2 - X

This reduces the input argument to a range from -1 to +1. The
Chebyshev polynomial evaluation is

      SIN (X) = X * CHEBY (2X^2 - 1, Csin, Nsin)

where Csin is an array of precalculated Chebyshev coefficients for
sine, and Nsin is the number of Chebyshev polynomial terms
used. In the case of the Am9511A

      Nsin  =
      Csin0 =  2.5525579
      Csin1 =
      Csin2 =  9.118016 x 10^-3
      Csin3 = -1.365875 x 10^-4
      Csin4 =  1.184962 x 10^-6
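The sine constants can be cross-checked against the coefficient integral of Section 9.2. The sketch below is not the procedure used to generate the ROM contents, which the text does not describe; it assumes that the expanded function is sin(πx/2)/x expressed in the variable u = 2x² - 1, and it approximates the integral with Gauss-Chebyshev quadrature. The names `chebyshev_series` and `sine_kernel` are illustrative.

```python
import math

def chebyshev_series(f, n_terms, samples=64):
    """Approximate C_n = (2/pi) * integral of f(u) T_n(u) / sqrt(1 - u^2)."""
    coeffs = []
    for n in range(n_terms):
        total = 0.0
        for j in range(samples):
            theta = math.pi * (j + 0.5) / samples
            # At u = cos(theta), T_n(u) = cos(n * theta).
            total += f(math.cos(theta)) * math.cos(n * theta)
        coeffs.append(2.0 * total / samples)
    return coeffs

def sine_kernel(u):
    """sin(pi*x/2)/x written in terms of u = 2*x**2 - 1."""
    x = math.sqrt((u + 1.0) / 2.0)
    return math.sin(math.pi * x / 2.0) / x

coeffs = chebyshev_series(sine_kernel, 5)
for k, value in enumerate(coeffs):
    print(k, value)
```

Under these assumptions the k = 0, 2, 3 and 4 results agree with 2.5525579, 9.118016 x 10^-3, -1.365875 x 10^-4 and 1.184962 x 10^-6, and the k = 1 coefficient comes out at about -0.2853.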



9.5 COSINE 

Any argument of the cosine function can be reduced to a range from
0 to π. Hence, the formulas for cosine range reduction are

      X = X * 2/π
      X = 4 * ENTIER ((X + 2)/4) - X + 1
      IF (X.GT.1) X = 2 - X

The cosine function is now evaluated the same way as the sine
function

      COS (X) = X * CHEBY (2X^2 - 1, Csin, Nsin)

where Csin and Nsin are the same as for the sine function.
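The two range reductions can be verified without the Chebyshev step: if r is the reduced argument, then sin(πr/2) must equal the sine (or cosine) of the original argument. A minimal Python sketch, in which `reduce_sine` and `reduce_cosine` are illustrative names and `math.floor` stands in for ENTIER:

```python
import math

def reduce_sine(x):
    """Map x (radians) to r in [-1, 1] with sin(pi*r/2) == sin(x)."""
    x = x * 2.0 / math.pi
    x = x - 4.0 * math.floor((x + 1.0) / 4.0)   # now -1 <= x < 3
    if x > 1.0:
        x = 2.0 - x
    return x

def reduce_cosine(x):
    """Map x (radians) to r in [-1, 1] with sin(pi*r/2) == cos(x)."""
    x = x * 2.0 / math.pi
    x = 4.0 * math.floor((x + 2.0) / 4.0) - x + 1.0
    if x > 1.0:
        x = 2.0 - x
    return x

for angle in (-10.0, -0.5, 0.0, 1.0, 4.0, 100.0):
    rs, rc = reduce_sine(angle), reduce_cosine(angle)
    print(angle, math.sin(math.pi * rs / 2.0) - math.sin(angle),
          math.sin(math.pi * rc / 2.0) - math.cos(angle))
```

Both differences print as zero to within rounding, which shows why the cosine can reuse the sine coefficients: its range reduction already folds in the quarter-period shift.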

9.6 TANGENT 

Any argument for tangent can be reduced to a value from -π/2 to
+π/2. The range reduction has the same form as that of the sine
function (Figure 9.1), but the argument is scaled by 4/π, so that
a reduced value of X between -1 and +1 corresponds to an angle
between -π/4 and +π/4.

      X = X * 4/π
      X = X - 4 * ENTIER ((X + 1)/4)
      Y = X
      IF (Y.GT.1) X = 2 - X

The Chebyshev polynomial evaluation is

      TAN (X) = X * CHEBY (2X^2 - 1, Ctan, Ntan)

A postprocessing step is also required

      IF (Y.GT.1) TAN (X) = 1/TAN (X)

The constants used in the Am9511A are as follows:

      Ntan  =
      Ctan0 = 1.7701474
      Ctan1 = 1.0675393 x 10^-1
      Ctan2 = 7.5861016 x 10^-3
      Ctan3 = 5.4417038 x 10^-4
      Ctan4 = 3.9066370 x 10^-5
      Ctan5 = 2.8048161 x 10^-6
      Ctan6 = 2.0137658 x 10^-7
      Ctan7 = 1.4458187 x 10^-8
      Ctan8 = 1.0380510 x 10^-9
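The tangent constants can be exercised with a short Python sketch. It rests on two assumptions that should be read as such: the powers of ten are taken as -1, -3, -4, ... -9 in order (several are illegible in the printed table), and the argument is scaled by 4/π, because with these constants the series sums to π/4 at one end of its interval and to 1 at the other, which is the behaviour of tan(πx/4)/x. The names `cheby` and `tangent` are illustrative.

```python
import math

CTAN = [1.7701474, 1.0675393e-1, 7.5861016e-3, 5.4417038e-4, 3.9066370e-5,
        2.8048161e-6, 2.0137658e-7, 1.4458187e-8, 1.0380510e-9]

def cheby(x, c):
    """Evaluate c[0]/2 + sum of c[k] * T_k(x) by a backward recurrence."""
    b1 = b2 = 0.0
    for ck in reversed(c[1:]):
        b1, b2 = 2.0 * x * b1 - b2 + ck, b1
    return x * b1 - b2 + 0.5 * c[0]

def tangent(x):
    x = x * 4.0 / math.pi
    x = x - 4.0 * math.floor((x + 1.0) / 4.0)   # one period: -1 <= x < 3
    y = x
    if y > 1.0:
        x = 2.0 - x                              # reflect into (-1, 1)
    t = x * cheby(2.0 * x * x - 1.0, CTAN)
    if y > 1.0:
        t = 1.0 / t                              # tan(pi/2 - a) = 1/tan(a)
    return t

for angle in (0.1, 1.0, 2.0, -2.5):
    print(angle, tangent(angle), math.tan(angle))
```

Arguments that reduce exactly to Y = 2 correspond to odd multiples of π/2, where the tangent is unbounded; the sketch makes no attempt to trap that case.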



9.7 ARCSINE 

The absolute value of the argument of arcsine must be less than or
equal to 1, or else an input error is detected. Hence, range
reduction is not necessary.

There are two different Chebyshev polynomial expansions used,
depending on the initial value of X. If X^2 ≤ 1/2 then the following
formula is used

      ASIN (X) = X * √2 * CHEBY (4X^2 - 1, Casin, Nasin)



Figure 9.1. Tangent (horizontal axis: data values)
If 1/2 < X^2 ≤ 1 then

      ASIN (X) = SIGN (X) * (π/2 - √(2 - 2X^2) *
                 CHEBY (3 - 4X^2, Casin, Nasin))

where SIGN (X) is the sign of X. The values of Casin and Nasin
used in the Am9511A are as follows:

      Nasin  = 10
      Casin0 = 1.4866665
      Casin1 = 3.8853034 x 10^-2
      Casin2 = 2.8854414 x 10^-3
      Casin3 = 2.8842183 x 10^-4
      Casin4 = 3.3223672 x 10^-5
      Casin5 = 4.1584779 x 10^-6
      Casin6 = 5.4965045 x 10^-7
      Casin7 = 7.5500784 x 10^-8
      Casin8 = 1.0671938 x 10^-8
      Casin9 = 1.5421800 x 10^-9
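Both arcsine branches can be tried in Python. This is a sketch under assumptions: the multiplier in the first branch is taken to be √2 and the second branch is taken to be π/2 minus the expansion evaluated at √(1 - X²), since those readings make the listed constants reproduce the arcsine; `cheby`, `arcsine` and `arccosine` are illustrative names.

```python
import math

CASIN = [1.4866665, 3.8853034e-2, 2.8854414e-3, 2.8842183e-4, 3.3223672e-5,
         4.1584779e-6, 5.4965045e-7, 7.5500784e-8, 1.0671938e-8, 1.5421800e-9]

def cheby(x, c):
    """Evaluate c[0]/2 + sum of c[k] * T_k(x) by a backward recurrence."""
    b1 = b2 = 0.0
    for ck in reversed(c[1:]):
        b1, b2 = 2.0 * x * b1 - b2 + ck, b1
    return x * b1 - b2 + 0.5 * c[0]

def arcsine(x):
    if abs(x) > 1.0:
        raise ValueError("arcsine argument out of range")
    x2 = x * x
    if x2 <= 0.5:
        return x * math.sqrt(2.0) * cheby(4.0 * x2 - 1.0, CASIN)
    # For larger |x| use asin(x) = pi/2 - asin(sqrt(1 - x^2)).
    tail = math.sqrt(2.0 - 2.0 * x2) * cheby(3.0 - 4.0 * x2, CASIN)
    return math.copysign(math.pi / 2.0 - tail, x)

def arccosine(x):
    return math.pi / 2.0 - arcsine(x)

for value in (-1.0, -0.3, 0.5, 0.9):
    print(value, arcsine(value), math.asin(value))
```

Splitting at X² = 1/2 keeps the polynomial away from X = ±1, where the slope of the arcsine is unbounded and no polynomial of modest degree fits well.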



9.8 ARCCOSINE 

The arccosine is obtained from arcsine by using the trigonometric
identity

      ARCCOSINE (X) = π/2 - ARCSINE (X)



9.9 ARCTANGENT 

The range reduction of the arctangent function involves taking the
reciprocal of the input argument if the absolute value of the input
argument is greater than 1.

      U = X
      IF (ABS (U).GT.1) X = 1/X

The Chebyshev polynomial evaluation is

      ATAN (X) = X * CHEBY (2X^2 - 1, Catan, Natan)

The postprocessing requirement is

      IF (U.GT.1) ATAN (X) = π/2 - ATAN (X)
      IF (U.LT.-1) ATAN (X) = -π/2 - ATAN (X)

The values of Natan and Catan used in the Am9511A are:

      Natan   = 11
      Catan0  =  1.7627472
      Catan1  = -1.0589292 x 10^-1
      Catan2  =  1.1135842 x 10^-2
      Catan3  = -1.3811950 x 10^-3
      Catan4  =  1.8574297 x 10^-4
      Catan5  = -2.6215196 x 10^-5
      Catan6  =  3.8210366 x 10^-6
      Catan7  = -5.6991862 x 10^-7
      Catan8  =  8.6488779 x 10^-8
      Catan9  = -1.3303384 x 10^-8
      Catan10 =  2.0685060 x 10^-9
      Catan11 = -3.2448600 x 10^-10
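The arctangent steps translate directly into Python. In this sketch the powers of ten are assumed to run from -1 to -10 in order (several are illegible in the printed table), all twelve listed constants are used, and `cheby` and `arctangent` are illustrative names.

```python
import math

CATAN = [1.7627472, -1.0589292e-1, 1.1135842e-2, -1.3811950e-3,
         1.8574297e-4, -2.6215196e-5, 3.8210366e-6, -5.6991862e-7,
         8.6488779e-8, -1.3303384e-8, 2.0685060e-9, -3.2448600e-10]

def cheby(x, c):
    """Evaluate c[0]/2 + sum of c[k] * T_k(x) by a backward recurrence."""
    b1 = b2 = 0.0
    for ck in reversed(c[1:]):
        b1, b2 = 2.0 * x * b1 - b2 + ck, b1
    return x * b1 - b2 + 0.5 * c[0]

def arctangent(x):
    u = x
    if abs(u) > 1.0:
        x = 1.0 / x                      # range reduction to |x| <= 1
    result = x * cheby(2.0 * x * x - 1.0, CATAN)
    if u > 1.0:
        result = math.pi / 2.0 - result
    elif u < -1.0:
        result = -math.pi / 2.0 - result
    return result

for value in (-20.0, -0.5, 1.0, 3.0):
    print(value, arctangent(value), math.atan(value))
```

The reciprocal works because atan(x) + atan(1/x) equals π/2 for positive x and -π/2 for negative x, which is exactly what the postprocessing step applies.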



9.10 EXPONENTIATION (Figure 9.2) 

The range reduction for the exponentiation function is performed
by the following formulas

      X = X * LOG2(e)
      N = 1 + ENTIER (X)

The Chebyshev polynomial evaluation is

      EXP (X) = 2^N * CHEBY (2*(N - X) - 1, Cexp, Nexp)

No postprocessing is required for the exponentiation function.
The values of Nexp and Cexp used by the Am9511A are:

      Nexp  = 8
      Cexp0 =  1.4569999
      Cexp1 = -2.4876243 x 10^-1
      Cexp2 =  2.1446556 x 10^-2
      Cexp3 = -1.2357141 x 10^-3
      Cexp4 =  5.3453058 x 10^-5
      Cexp5 = -1.8506907 x 10^-6
      Cexp6 =  5.3411877 x 10^-8
      Cexp7 = -1.3215160 x 10^-9
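The exponentiation steps can be sketched in Python as follows. The assumptions are that the scale factor in the first step is log2(e), that the powers of ten of the constants are -1, -2, -3, -5, -6, -8 and -9 in order, and that `math.floor` and `math.ldexp` stand in for ENTIER and the multiplication by 2^N; `cheby` and `exponential` are illustrative names.

```python
import math

CEXP = [1.4569999, -2.4876243e-1, 2.1446556e-2, -1.2357141e-3,
        5.3453058e-5, -1.8506907e-6, 5.3411877e-8, -1.3215160e-9]

def cheby(x, c):
    """Evaluate c[0]/2 + sum of c[k] * T_k(x) by a backward recurrence."""
    b1 = b2 = 0.0
    for ck in reversed(c[1:]):
        b1, b2 = 2.0 * x * b1 - b2 + ck, b1
    return x * b1 - b2 + 0.5 * c[0]

def exponential(x):
    x = x * math.log2(math.e)            # e**x becomes 2**x
    n = 1 + math.floor(x)                # so that 0 < n - x <= 1
    # The series approximates 2**(-(n - x)), a value in [0.5, 1).
    fraction = cheby(2.0 * (n - x) - 1.0, CEXP)
    return math.ldexp(fraction, n)       # fraction * 2**n

for value in (-5.0, 0.0, 1.0, 10.0):
    print(value, exponential(value), math.exp(value))
```

Because the series value lies between 1/2 and 1, it has the form of a normalized mantissa and N can serve directly as the binary exponent of the result.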




Figure 9.2. e^X



9.11 NATURAL LOGARITHM (Figure 9.3) 

Any input argument to a logarithm function that is less than or
equal to zero will be returned as an error input. No preprocessing
or postprocessing is necessary for any positive input X.

      LN (X) = CHEBY (4*Mant(X) - 3, CLN, NLN) + (Expo(X) - 1) * LN2

where Mant(X) is the mantissa value of X and Expo(X) is the
exponent value of X. Since the mantissa is at least 1/2 and less
than 1, the quantity 4*Mant(X) - 3 lies between -1 and +1.

The values of NLN and CLN used in the Am9511A are:

      NLN   = 11
      CLN0  =  7.5290563 x 10^-1
      CLN1  =  3.4314575 x 10^-1
      CLN2  = -2.9437253 x 10^-2
      CLN3  =  3.3670893 x 10^-3
      CLN4  = -4.3327589 x 10^-4
      CLN5  =  5.9470712 x 10^-5
      CLN6  = -8.5029675 x 10^-6
      CLN7  =  1.2504674 x 10^-6
      CLN8  = -1.8772800 x 10^-7
      CLN9  =  2.8630251 x 10^-8
      CLN10 = -4.4209570 x 10^-9
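The logarithm formula can be tried in Python, where `math.frexp` returns a mantissa in the range 1/2 to 1 and a binary exponent, matching the roles of Mant(X) and Expo(X). The sketch assumes that the factor LN2 multiplies only the exponent term, because the listed constants expand the natural logarithm of twice the mantissa, and that the powers of ten run from -1 to -9 as shown in the code. `cheby` and `natural_log` are illustrative names.

```python
import math

CLN = [7.5290563e-1, 3.4314575e-1, -2.9437253e-2, 3.3670893e-3,
       -4.3327589e-4, 5.9470712e-5, -8.5029675e-6, 1.2504674e-6,
       -1.8772800e-7, 2.8630251e-8, -4.4209570e-9]

def cheby(x, c):
    """Evaluate c[0]/2 + sum of c[k] * T_k(x) by a backward recurrence."""
    b1 = b2 = 0.0
    for ck in reversed(c[1:]):
        b1, b2 = 2.0 * x * b1 - b2 + ck, b1
    return x * b1 - b2 + 0.5 * c[0]

def natural_log(x):
    if x <= 0.0:
        raise ValueError("logarithm argument must be positive")
    mant, expo = math.frexp(x)           # x = mant * 2**expo, 0.5 <= mant < 1
    # The series gives ln(2 * mant); the exponent supplies the rest.
    return cheby(4.0 * mant - 3.0, CLN) + (expo - 1) * math.log(2.0)

for value in (0.001, 1.0, 2.0, 12345.0):
    print(value, natural_log(value), math.log(value))
```

The second constant, 0.34314575, and twice its value, 0.68629150, reappear as the intercept and slope of the square root approximation in Section 9.14.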
Figure 9.3. Natural Logarithm



Figure 9.4. Square Root 



9.12 LOGARITHM TO BASE 10 (COMMON LOGARITHM)

The common logarithm is derived from the natural logarithm by
the equation

      LOG (X) = LN (X) * LOG10(e)

where

      LOG10(e) = 0.4342945

9.13 X TO THE POWER OF Y

The function X to the power of Y is derived from the following
equation

      X^Y = e^(Y * LN (X))

9.14 SQUARE ROOT 

The square root function (Figure 9.4) in the Am9511A is the only
derived function that does not use the Chebyshev polynomials. It
uses a combination of linear approximation and the Newton-Raphson
successive approximation method. The square root algorithm adopted
is divided into three parts:

(a) Range reduction -

    The input argument is divided into the exponent and the
    mantissa. If the exponent is odd, the exponent is incremented
    by 1 and the mantissa is divided by 2. If the input exponent is
    even, the above step is skipped.

(b) Linear Approximation -

    The mantissa is now a number greater than or equal to 1/4
    and less than 1. The curved line in Figure 9.5 represents the
    square root of all numbers between 1/4 and 1. The straight
    line represents the first-order approximation for the square
    root of the number. To select the best straight line, we must
    minimize the maximum relative error between the straight
    line and the curved line. This reduces the worst case error
    to a minimum. This line is known as the minimax line.

The method used to compute the best linear approximation line is
as follows:

Let m = slope of the minimax line
Let b = Y intercept of the minimax line
Let Y = the function of the minimax line

such that

$$Y = mx + b$$

The relative error between the actual square root value and the
first-order approximation is

$$E(x) = \frac{mx + b - \sqrt{x}}{\sqrt{x}}$$

Figure 9.5 shows that the absolute value of E(x) is a maximum at
the two extremities (x = 1/4 and x = 1) and at a point where the
slope of the curve E(x) = 0, or dE/dx = 0.

$$\frac{dE}{dx} = \frac{d}{dx}\left(\frac{mx + b - \sqrt{x}}{\sqrt{x}}\right)
  = m\,\frac{d}{dx}\,x^{1/2} + b\,\frac{d}{dx}\,x^{-1/2}
  = \tfrac{1}{2}\,m\,x^{-1/2} - \tfrac{1}{2}\,b\,x^{-3/2} = 0$$

therefore

$$m\,x^{-1/2} = b\,x^{-3/2}$$

$$x = \frac{b}{m} \qquad (9.1)$$

The relative errors at the extremities are given by

$$E\left(\tfrac{1}{4}\right) = \frac{\frac{m}{4} + b - \frac{1}{2}}{\frac{1}{2}}
  = \frac{m}{2} + 2b - 1 \qquad (9.2)$$

$$E(1) = \frac{m + b - \sqrt{1}}{\sqrt{1}} = m + b - 1 \qquad (9.3)$$

The minimax line requires these maximum errors to be equal

$$\frac{m}{2} + 2b - 1 = m + b - 1$$

$$b - \frac{m}{2} = 0$$

$$\frac{b}{m} = \frac{1}{2} \qquad (9.4)$$

$$m = 2b \qquad (9.5)$$

From equations 9.1 and 9.4

$$x = \frac{b}{m} = \frac{1}{2}$$

Therefore, the maximum error in the middle occurs when x = 1/2.
The minimax line requires these errors to be equal in magnitude.
Thus

$$E\left(\tfrac{1}{4}\right) = E(1) = -E\left(\tfrac{1}{2}\right) \qquad (9.6)$$

$$E\left(\tfrac{1}{2}\right) = \frac{\frac{m}{2} + b - \sqrt{\frac{1}{2}}}{\sqrt{\frac{1}{2}}}$$

Since m = 2b from equation 9.5

$$E\left(\tfrac{1}{2}\right) = 2\sqrt{2}\,b - 1 \qquad (9.7)$$

From equations 9.3 and 9.5

$$E(1) = 3b - 1 \qquad (9.8)$$

From equations 9.6, 9.7 and 9.8

$$2\sqrt{2}\,b - 1 = -(3b - 1) = 1 - 3b$$

$$b = \frac{2}{2\sqrt{2} + 3} = 0.34314575$$

From equation 9.5

$$m = 2b = 0.68629150$$

Therefore, the minimax line is given by

$$Y = 0.68629150\,x + 0.34314575$$

This is the equation used in the Am9511A for the first-order linear
approximation. Therefore

$$X_0 = 0.68629150\,X + 0.34314575$$

(c) Newton-Raphson successive approximation -

    After the first-order approximation (X0) is obtained, the
    Am9511A executes two iterations of the Newton-Raphson
    approximation

      X1 = (X/X0 + X0)/2
      X2 = (X/X1 + X1)/2

    where X is the mantissa after range reduction. The result is
    given by

      SQRT (X) = X2 * 2^(EXPO/2)

    where EXPO is the even exponent obtained in step (a).

Figure 9.5. Square Root Computation



A FORTRAN function to illustrate the above algorithm is given
below. It uses the ENTIER function of Section 9.3:

      FUNCTION ROOT (X)
      INTEGER EXPO, LSB
      REAL MANT, X0, X1, X2
C     SPLIT X INTO MANT * 2**EXPO WITH 0.5 .LE. MANT .LT. 1
      EXPO = INT (ENTIER (LOG(X)/LOG(2.0))) + 1
      MANT = X/2.0**EXPO
      LSB = MOD(EXPO, 2)
      IF (LSB.EQ.0) GOTO 100
C     EXPONENT IS ODD
      EXPO = EXPO + 1
      MANT = MANT/2.0
  100 X0 = 0.68629150 * MANT + 0.34314575
      X1 = (MANT/X0 + X0)/2.0
      X2 = (MANT/X1 + X1)/2.0
      ROOT = (2.0**(EXPO/2)) * X2
      END
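The square root algorithm, including the equal-error property of the minimax line, can be checked with a short Python sketch. The names are illustrative, `math.frexp` is assumed as a stand-in for the exponent and mantissa split, and the two Newton-Raphson steps are applied to the reduced mantissa.

```python
import math

M_SLOPE = 0.68629150
B_INTERCEPT = 0.34314575

def linear_guess(mant):
    """First-order (minimax line) estimate of sqrt(mant), 1/4 <= mant < 1."""
    return M_SLOPE * mant + B_INTERCEPT

def relative_error(mant):
    """Relative error E(x) of the minimax line at x = mant."""
    return (linear_guess(mant) - math.sqrt(mant)) / math.sqrt(mant)

def square_root(x):
    if x < 0.0:
        raise ValueError("square root argument must not be negative")
    if x == 0.0:
        return 0.0
    mant, expo = math.frexp(x)           # x = mant * 2**expo, 0.5 <= mant < 1
    if expo % 2:                         # odd exponent: make it even
        expo += 1
        mant /= 2.0                      # now 0.25 <= mant < 1
    x0 = linear_guess(mant)
    x1 = (mant / x0 + x0) / 2.0          # first Newton-Raphson step
    x2 = (mant / x1 + x1) / 2.0          # second Newton-Raphson step
    return math.ldexp(x2, expo // 2)     # x2 * 2**(expo/2)

print(relative_error(0.25), relative_error(0.5), relative_error(1.0))
for value in (0.2, 2.0, 12345.678):
    print(value, square_root(value), math.sqrt(value))
```

The three printed relative errors have the same magnitude, about 0.029, with the middle one negative. Each Newton-Raphson step roughly squares the relative error and halves it, so two steps bring an initial error of about 3 parts in 100 down to about 1 part in 10^7.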



9.15 DERIVED FUNCTION ERROR PERFORMANCE

Since each of the derived functions is an approximation of the true
function, results computed by the Am9511A are not always exact.
In order to quantify the error performance of the component more
comprehensively, the following graphs have been prepared.
Each function has been executed with a statistically significant
number of diverse data values, spanning the allowable input data
range, and resulting errors have been tabulated. Absolute errors
(that is, the number of bits in error) have been converted to
relative errors according to the following equation:

$$\text{Relative Error} = \frac{\text{Absolute Error}}{\text{True Result}}$$

This conversion permits the error to be viewed with respect to the
magnitude of the true result. This provides a more objective
measurement of error performance since it directly translates to a
measure of significant digits of algorithm accuracy.
For example, if a given absolute error is 0.0001 and the true result
is also 0.0001, it is clear that the relative error is equal to 1.0
(which implies that even the first significant digit of the result is
wrong). However, if the same absolute error is computed for a true
result of 10000.0, then the relative error is only 0.00000001
(0.0001/10000) and at least the first seven significant digits of the
result are correct.

Each of the following graphs was prepared to illustrate relative
algorithm error as a function of input data range. Natural
logarithm is the only exception; since logarithms are typically
additive, absolute error is plotted for this function.

Two graphs have not been included in the following figures:
common logarithms and the power function (X^Y). Common
logarithms are computed by multiplication of the natural
logarithm by the conversion factor 0.43429448, and the error
function is therefore the same as that for natural logarithm. The
power function is realized by combination of natural log and
exponential functions according to the equation

$$X^Y = e^{Y \ln X}$$

The error for the power function is a combination of that for the
logarithm and exponential functions. Specifically, the relative
error for PWR is expressed as

$$RE_{PWR} = RE_{EXP} + Y\,(AE_{LN})$$

where

$RE_{PWR}$ = relative error for power function
$RE_{EXP}$ = relative error for exponential function
$AE_{LN}$ = absolute error for natural logarithm
$Y$ = value of the exponent in $X^Y$
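The way the two errors combine can be illustrated numerically. Because the power function is computed as e raised to Y times the natural logarithm of X, an absolute error in the logarithm is multiplied by the exponent Y before it reaches the exponential, where it appears as a relative error. The Python sketch below injects hypothetical error values (they are not measured Am9511A figures) and compares the observed relative error with that first-order estimate; all names are illustrative.

```python
import math

def power_with_errors(x, y, ae_ln=0.0, re_exp=0.0):
    """x**y computed as exp(y * ln x) with injected errors.

    ae_ln  -- absolute error added to the natural logarithm
    re_exp -- relative error applied to the exponential result
    """
    ln_x = math.log(x) + ae_ln
    return math.exp(y * ln_x) * (1.0 + re_exp)

x, y = 3.0, 20.0
ae_ln, re_exp = 1e-7, 2e-7               # hypothetical error magnitudes

exact = x ** y
observed = (power_with_errors(x, y, ae_ln, re_exp) - exact) / exact
predicted = re_exp + y * ae_ln

print(observed, predicted)
```

Both printed values are close to 2.2 x 10^-6. A large exponent therefore magnifies the logarithm error, which is why the accuracy of the power function depends on its operands more strongly than that of the other derived functions.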